0.1 Introduction

This book represents course notes for a one semester course at the undergraduate 
level giving an introduction to Riemannian geometry and its principal physical
application, Einstein's theory of general relativity. The background assumed is 
a good grounding in linear algebra and in advanced calculus, preferably in the 
language of differential forms. 

Chapter I introduces the various curvatures associated to a hypersurface
embedded in Euclidean space, motivated by the formula for the volume for 
the region obtained by thickening the hypersurface on one side. If we thicken 
the hypersurface by an amount h in the normal direction, this formula is a 
polynomial in h whose coefficients are integrals over the hypersurface of local 
expressions. These local expressions are elementary symmetric polynomials in
what are known as the principal curvatures. The precise definitions are given in 
the text. The chapter culminates with Gauss' Theorema egregium which asserts 
that if we thicken a two dimensional surface evenly on both sides, then these
integrands depend only on the intrinsic geometry of the surface, and not on how 
the surface is embedded. We give two proofs of this important theorem. (We 
give several more later in the book.) The first proof makes use of "normal coor- 
dinates" which become so important in Riemannian geometry and, as "inertial 
frames," in general relativity. It was this theorem of Gauss, and particularly 
the very notion of "intrinsic geometry", which inspired Riemann to develop his
geometry. 

Chapter II is a rapid review of the differential and integral calculus on
manifolds, including differential forms, the d operator, Stokes' theorem,
vector fields, and Lie derivatives. At the end of the chapter is a series of
sections in exercise form which lead to the notion of parallel transport of a
vector along a curve on an embedded surface as being associated with the
"rolling of the surface on a plane along the curve".

Chapter III discusses the fundamental notions of linear connections and their 
curvatures, and also Cartan's method of calculating curvature using frame fields 
and differential forms. We show that the geodesics on a Lie group equipped with
a bi-invariant metric are the translates of the one parameter subgroups. A short
exercise set at the end of the chapter uses the Cartan calculus to compute the
curvature of the Schwarzschild metric. A second exercise set computes some
geodesics in the Schwarzschild metric leading to two of the famous predictions
of general relativity: the advance of the perihelion of Mercury and the bending 
of light by matter. Of course the theoretical basis of these computations, i.e. 
the theory of general relativity, will come later, in Chapter VIII.

Chapter IV begins by discussing the bundle of frames which is the modern 
setting for Cartan's calculus of "moving frames" and also the jumping off point 
for the general theory of connections on principal bundles which lie at the base 
of such modern physical theories as Yang-Mills fields. This chapter seems to 
present the most difficulty conceptually for the student. 

Chapter V discusses the general theory of connections on fiber bundles and
then specializes to principal and associated bundles.

Chapter VI returns to Riemannian geometry and discusses Gauss's lemma
which asserts that the radial geodesics emanating from a point are orthogonal
(in the Riemann metric) to the images under the exponential map of the
spheres in the tangent space centered at the origin. From this one concludes
that geodesics (defined as self parallel curves) locally minimize arc length in a
Riemann manifold.

Chapter VII is a rapid review of special relativity. It is assumed that the 
students will have seen much of this material in a physics course. 

Chapter VIII is the high point of the course from the theoretical point of 
view. We discuss Einstein's general theory of relativity from the point of view of 
the Einstein-Hilbert functional. In fact we borrow the title of Hilbert's paper for
the Chapter heading. We also introduce the principle of general covariance, first
introduced by Einstein, Infeld, and Hoffmann to derive the "geodesic principle",
and give a whole series of other applications of this principle.

Chapter IX discusses computational methods deriving from the notion of 
a Riemannian submersion, introduced and developed by Robert Hermann and 
perfected by Barrett O'Neill. It is the natural setting for the generalized Gauss- 
Codazzi type equations. Although technically somewhat demanding at the be- 
ginning, the range of applications justifies the effort in setting up the theory. 
Applications range from curvature computations for homogeneous spaces to cos- 
mogony and eschatology in Friedman type models. 

Chapter X discusses the Petrov classification, using complex geometry, of the 
various types of solutions to the Einstein equations in four dimensions. This 
classification led Kerr to his discovery of the rotating black hole solution which 
is a topic for a course of its own. The exposition in this chapter follows joint
work with Kostant. 

Chapter XI is in the form of an enlarged exercise set on the star operator. It
is essentially independent of the entire course, but I thought it useful to include, 
as it would be of interest in any more advanced treatment of topics in the course. 

Chapter 1

The principal curvatures. 



1.1 Volume of a thickened hypersurface 

We want to consider the following problem: Let $Y \subset \mathbb{R}^n$ be an oriented
hypersurface, so there is a well defined unit normal vector, $\nu(y)$, at each point of $Y$.
Let $Y_h$ denote the set of all points of the form

$$y + t\nu(y), \quad 0 \le t \le h.$$

We wish to compute $V_n(Y_h)$ where $V_n$ denotes the $n$-dimensional volume. We
will do this computation for small $h$; see the discussion after the examples.



Examples in three dimensional space. 

1. Suppose that Y is a bounded region in a plane, of area A. Clearly 

$$V_3(Y_h) = hA$$

in this case. 

2. Suppose that $Y$ is a right circular cylinder of radius $r$ and height $\ell$ with
outwardly pointing normal. Then $Y_h$ is the region between the right circular
cylinders of height $\ell$ and radii $r$ and $r + h$ so

$$V_3(Y_h) = \pi\ell\left[(r+h)^2 - r^2\right] = 2\pi\ell r h + \pi\ell h^2 = hA + h^2\,\frac{1}{2r}A = A\left(h + \frac{1}{2}kh^2\right)$$

where $A = 2\pi r\ell$ is the area of the cylinder and where $k = 1/r$ is the curvature of
the generating circle of the cylinder. For small $h$, this formula is correct, in fact,
whether we choose the normal vector to point out of the cylinder or into the
cylinder. Of course, in the inward pointing case, the curvature has the opposite
sign, $k = -1/r$.

For inward pointing normals, the formula breaks down when $h > r$, since we
get multiple coverage of points in space by points of the form $y + t\nu(y)$.

3. $Y$ is a sphere of radius $R$ with outward normal, so $Y_h$ is a spherical shell,
and

$$V_3(Y_h) = \frac{4}{3}\pi\left[(R+h)^3 - R^3\right] = \frac{1}{3}A\left[3h + 3\frac{1}{R}h^2 + \frac{1}{R^2}h^3\right]$$

where $A = 4\pi R^2$ is the area of the sphere.

Once again, for inward pointing normals we must change the sign of the
coefficient of $h^2$ and the formula thus obtained is only correct for $h \le R$.

So in general, we wish to make the assumption that $h$ is such that the map

$$Y \times [0,h] \to \mathbb{R}^n, \quad (y,t) \mapsto y + t\nu(y)$$

is injective. For $Y$ compact, there always exists an $h_0 > 0$ such that this
condition holds for all $h < h_0$. This can be seen to be a consequence of the
implicit function theorem. But so as not to interrupt the discussion, we will take
the injectivity of the map as a hypothesis, for the moment.

In a moment we will define the notion of the various averaged curvatures,
$H_1, \ldots, H_{n-1}$, of a hypersurface, and find for the case of the sphere with outward
pointing normal, that

$$H_1 = \frac{1}{R}, \quad H_2 = \frac{1}{R^2},$$

while for the case of the cylinder with outward pointing normal that

$$H_1 = \frac{1}{2r}, \quad H_2 = 0,$$

and for the case of the planar region that

$$H_1 = H_2 = 0.$$

We can thus write all three of the above formulas as

$$V_3(Y_h) = \frac{1}{3}A\left[3h + 3H_1h^2 + H_2h^3\right].$$
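
The unified formula can be checked numerically against the direct volume computations in the examples. The following Python sketch is only an illustration: the function names and the sample values of $R$, $r$, $\ell$ and $h$ are our own choices, not part of the text.

```python
import math

def thickened_volume(area, h, H1, H2):
    """V_3(Y_h) = (1/3) A [3h + 3 H1 h^2 + H2 h^3], for constant H1 and H2."""
    return area * (3 * h + 3 * H1 * h**2 + H2 * h**3) / 3

def shell_volume(R, h):
    """Volume between concentric spheres of radii R and R + h."""
    return 4 * math.pi / 3 * ((R + h)**3 - R**3)

def cylinder_layer_volume(r, ell, h):
    """Volume between coaxial cylinders of height ell and radii r and r + h."""
    return math.pi * ell * ((r + h)**2 - r**2)

# Arbitrary sample values (assumptions made for the illustration).
R, r, ell, h = 2.0, 1.5, 3.0, 0.25

sphere_formula = thickened_volume(4 * math.pi * R**2, h, 1 / R, 1 / R**2)
cylinder_formula = thickened_volume(2 * math.pi * r * ell, h, 1 / (2 * r), 0.0)

print(sphere_formula, shell_volume(R, h))
print(cylinder_formula, cylinder_layer_volume(r, ell, h))
```

Each printed pair should agree up to rounding, since the formula is an algebraic identity in these two cases.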

1.2 The Gauss map and the Weingarten map.

In order to state the general formula, we make the following definitions: Let
$Y$ be an (immersed) oriented hypersurface. At each $x \in Y$ there is a unique
(positive) unit normal vector, and hence a well defined Gauss map

$$\nu : Y \to S^{n-1}$$

assigning to each point $x \in Y$ its unit normal vector, $\nu(x)$. Here $S^{n-1}$ denotes
the unit sphere, the set of all unit vectors in $\mathbb{R}^n$.

The normal vector, $\nu(x)$, is orthogonal to the tangent space to $Y$ at $x$. We
will denote this tangent space by $TY_x$. For our present purposes, we can regard
$TY_x$ as a subspace of $\mathbb{R}^n$: If $t \mapsto \gamma(t)$ is a differentiable curve lying on the
hypersurface $Y$ (this means that $\gamma(t) \in Y$ for all $t$) and if $\gamma(0) = x$, then $\gamma'(0)$
belongs to the tangent space $TY_x$. Conversely, given any vector $v \in TY_x$, we
can always find a differentiable curve $\gamma$ with $\gamma(0) = x$, $\gamma'(0) = v$. So a good
way to think of a tangent vector to $Y$ at $x$ is as an "infinitesimal curve" on $Y$
passing through $x$.

Examples:

1. Suppose that $Y$ is a portion of an $(n-1)$ dimensional linear or affine
subspace sitting in $\mathbb{R}^n$. For example suppose that $Y = \mathbb{R}^{n-1}$ consisting
of those points in $\mathbb{R}^n$ whose last coordinate vanishes. Then the tangent
space to $Y$ at every point is just this same subspace, and hence the normal
vector is a constant. The Gauss map is thus a constant, mapping all of $Y$
onto a single point in $S^{n-1}$.

2. Suppose that $Y$ is the sphere of radius $R$ (say centered at the origin). The
Gauss map carries every point of $Y$ into the corresponding (parallel) point
of $S^{n-1}$. In other words, it is multiplication by $1/R$:

$$\nu(y) = \frac{1}{R}\,y.$$

3. Suppose that $Y$ is a right circular cylinder in $\mathbb{R}^3$ whose base is the circle
of radius $r$ in the $x^1, x^2$ plane. Then the Gauss map sends $Y$ onto the
equator of the unit sphere, $S^2$, sending a point $x$ into $(1/r)\pi(x)$ where
$\pi : \mathbb{R}^3 \to \mathbb{R}^2$ is projection onto the $x^1, x^2$ plane.

Another good way to think of the tangent space is in terms of a local
parameterization which means that we are given a map $X : M \to \mathbb{R}^n$ where
$M$ is some open subset of $\mathbb{R}^{n-1}$ and such that $X(M)$ is some neighborhood of
$x$ in $Y$. Let $y^1, \ldots, y^{n-1}$ be the standard coordinates on $\mathbb{R}^{n-1}$. Part of the
requirement that goes into the definition of parameterization is that the map $X$
be regular, in the sense that its Jacobian matrix

$$dX := \left(\frac{\partial X}{\partial y^1}, \ldots, \frac{\partial X}{\partial y^{n-1}}\right)$$

whose columns are the partial derivatives of the map $X$ has rank $n-1$ everywhere.
The matrix $dX$ has $n$ rows and $n-1$ columns. The regularity condition
amounts to the assertion that for each $z \in M$ the vectors

$$\frac{\partial X}{\partial y^1}(z), \ldots, \frac{\partial X}{\partial y^{n-1}}(z)$$

span a subspace of dimension $n-1$. If $x = X(y)$ then the tangent space $TY_x$ is
precisely the space spanned by

$$\frac{\partial X}{\partial y^1}(y), \ldots, \frac{\partial X}{\partial y^{n-1}}(y).$$

Suppose that $F$ is a differentiable map from $Y$ to $\mathbb{R}^m$. We can then define
its differential, $dF_x : TY_x \to \mathbb{R}^m$. It is a linear map assigning to each $v \in TY_x$
a value $dF_x(v) \in \mathbb{R}^m$: In terms of the "infinitesimal curve" description, if
$v = \gamma'(0)$ then

$$dF_x(v) = \frac{d(F \circ \gamma)}{dt}(0).$$

(You must check that this does not depend on the choice of representing curve,
$\gamma$.)

Alternatively, to give a linear map, it is enough to give its value at the
elements of a basis. In terms of the basis coming from a parameterization, we
have

$$dF_x\left(\frac{\partial X}{\partial y^j}(y)\right) = \frac{\partial (F \circ X)}{\partial y^j}(y).$$

Here $F \circ X : M \to \mathbb{R}^m$ is the composition of the map $F$ with the map $X$. You
must check that the map $dF_x$ so determined does not depend on the choice of
parameterization. Both of these verifications proceed by the chain rule.

One immediate consequence of either characterization is the following
important property. Suppose that $F$ takes values in a submanifold $Z \subset \mathbb{R}^m$.
Then

$$dF_x : TY_x \to TZ_{F(x)}.$$

Let us apply all this to the Gauss map, $\nu$, which maps $Y$ to the unit sphere,
$S^{n-1}$. Then

$$d\nu_x : TY_x \to TS^{n-1}_{\nu(x)}.$$

But the tangent space to the unit sphere at $\nu(x)$ consists of all vectors
perpendicular to $\nu(x)$ and so can be identified with $TY_x$. We define the
Weingarten map to be the differential of the Gauss map, regarded as a map from
$TY_x$ to itself:

$$W_x := d\nu_x, \quad W_x : TY_x \to TY_x.$$

The second fundamental form is defined to be the bilinear form on $TY_x$
given by

$$II_x(v,w) := (W_x v, w).$$
In the next section we will show, using local coordinates, that this form is
symmetric, i.e. that

$$(W_x u, v) = (u, W_x v).$$

This implies, from linear algebra, that $W_x$ is diagonalizable with real eigenvalues.
These eigenvalues, $k_1 = k_1(x), \ldots, k_{n-1} = k_{n-1}(x)$, of the Weingarten map are
called the principal curvatures of $Y$ at the point $x$.

Examples:

1. For a portion of $(n-1)$ space sitting in $\mathbb{R}^n$ the Gauss map is constant
so its differential is zero. Hence the Weingarten map and thus all the
principal curvatures are zero.

2. For the sphere of radius $R$ the Gauss map consists of multiplication by $1/R$
which is a linear transformation. The differential of a linear transformation
is that same transformation (regarded as acting on the tangent spaces).
Hence the Weingarten map is $(1/R) \times \mathrm{id}$ and so all the principal curvatures
are equal and are equal to $1/R$.

3. For the cylinder, again the Gauss map is linear, and so the principal
curvatures are $0$ and $1/r$.

We let $H_j$ denote the $j$th normalized elementary symmetric function of the
principal curvatures. So

$$H_0 = 1$$

$$H_1 = \frac{1}{n-1}(k_1 + \cdots + k_{n-1})$$

$$H_{n-1} = k_1 \cdot k_2 \cdots k_{n-1}$$

and, in general,

$$H_j = \binom{n-1}{j}^{-1} \sum_{1 \le i_1 < \cdots < i_j \le n-1} k_{i_1} \cdots k_{i_j}. \tag{1.1}$$

$H_1$ is called the mean curvature and $H_{n-1}$ is called the Gaussian curvature.

All the principal curvatures are functions of the point $x \in Y$. For notational
simplicity, we will frequently suppress the dependence on $x$. Then the formula
for the volume of the thickened hypersurface (we will call this the "volume
formula" for short) is:

$$V_n(Y_h) = \frac{1}{n}\sum_{i=1}^{n}\binom{n}{i}h^i\int_Y H_{i-1}\,d^{n-1}A \tag{1.2}$$

where $d^{n-1}A$ denotes the $(n-1)$ dimensional volume (area) measure on $Y$.

An immediate check shows that this gives the answers that we got above for
the plane, the cylinder, and the sphere.
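
As a sketch of how (1.1) and (1.2) fit together, the following Python fragment evaluates the right hand side of (1.2) in the special case where the principal curvatures are constant on $Y$, so that each integral reduces to $H_{i-1}$ times the total area. The helper names and the constant-curvature restriction are assumptions made for this illustration; the general formula of course allows the $H_j$ to vary from point to point.

```python
from itertools import combinations
from math import comb, prod, pi

def normalized_symmetric(ks, j):
    """H_j of (1.1): the average of all products of j distinct principal curvatures."""
    return sum(prod(c) for c in combinations(ks, j)) / comb(len(ks), j)

def volume_constant_curvature(area, ks, h):
    """Right hand side of (1.2) when k_1, ..., k_{n-1} are constant on Y."""
    n = len(ks) + 1
    return sum(comb(n, i) * h**i * normalized_symmetric(ks, i - 1) * area
               for i in range(1, n + 1)) / n

# Sphere of radius R in three dimensional space: both principal curvatures are 1/R.
R, h = 2.0, 0.25
shell_from_formula = volume_constant_curvature(4 * pi * R**2, [1 / R, 1 / R], h)
shell_direct = 4 * pi / 3 * ((R + h)**3 - R**3)
print(shell_from_formula, shell_direct)
```

Note that `normalized_symmetric(ks, 0)` returns 1, in agreement with $H_0 = 1$, because the only product of zero curvatures is the empty product.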

1.3 Proof of the volume formula.

We recall that the Gauss map, $\nu$, assigns to each point $x \in Y$ its unit normal
vector, and so is a map from $Y$ to the unit sphere, $S^{n-1}$. The Weingarten map,
$W_x$, is the differential of the Gauss map, $W_x = d\nu_x$, regarded as a map of the
tangent space, $TY_x$, to itself. We now describe these maps in terms of a local
parameterization of $Y$. So let $X : M \to \mathbb{R}^n$ be a parameterization of class
$C^2$ of a neighborhood of $Y$ near $x$, where $M$ is an open subset of $\mathbb{R}^{n-1}$. So
$x = X(y)$, $y \in M$, say. Let

$$N := \nu \circ X$$

so that $N : M \to S^{n-1}$ is a map of class $C^1$. The map

$$dX_y : \mathbb{R}^{n-1} \to TY_x$$

gives a frame of $TY_x$. The word "frame" means an isomorphism of our "standard"
$(n-1)$-dimensional space, $\mathbb{R}^{n-1}$, with our given $(n-1)$-dimensional space,
$TY_x$. Here we have identified $T(\mathbb{R}^{n-1})_y$ with $\mathbb{R}^{n-1}$, so the frame $dX_y$ gives us
a particular isomorphism of $\mathbb{R}^{n-1}$ with $TY_x$.

Giving a frame of a vector space is the same as giving a basis of that vector
space. We will use these two different ways of using the word "frame"
interchangeably. Let $e_1, \ldots, e_{n-1}$ denote the standard basis of $\mathbb{R}^{n-1}$, and for $X$
and $N$, let the subscript $i$ denote the partial derivative with respect to the $i$th
Cartesian coordinate. Thus

$$dX_y(e_i) = X_i(y)$$

for example, and so $X_1(y), \ldots, X_{n-1}(y)$ "is" the frame determined by $dX_y$
(when we regard $TY_x$ as a subspace of $\mathbb{R}^n$). For the sake of notational
simplicity we will drop the argument $y$. Thus we have

$$dX(e_i) = X_i, \quad dN(e_i) = N_i,$$

and so

$$W_x X_i = N_i.$$

Recall the definition, $II_x(v,w) = (W_x v, w)$, of the second fundamental form.
Let $(L_{ij})$ denote the matrix of the second fundamental form with respect to the
basis $X_1, \ldots, X_{n-1}$ of $TY_x$. So

$$L_{ij} = II_x(X_i, X_j) = (W_x X_i, X_j) = (N_i, X_j)$$

so

$$L_{ij} = -\left(N, \frac{\partial^2 X}{\partial y_i \partial y_j}\right), \tag{1.3}$$

the last equality coming from differentiating the identity

$$(N, X_j) \equiv 0$$

in the $i$th direction. In particular, it follows from (1.3) and the equality of cross
derivatives that

$$(W_x X_i, X_j) = (X_i, W_x X_j)$$

and hence, by linearity, that

$$(W_x u, v) = (u, W_x v) \quad \forall\, u, v \in TY_x.$$

We have proved that the second fundamental form is symmetric, and hence the
Weingarten map is diagonalizable with real eigenvalues.

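Recall that the principal curvatures are, by definition, the eigenvalues of the
Weingarten map. We will let

$$W = (W_{ij})$$

denote the matrix of the Weingarten map with respect to the basis $X_1, \ldots, X_{n-1}$.
Explicitly,

$$N_i = W_x X_i = \sum_j W_{ji} X_j.$$

If we write $N_1, \ldots, N_{n-1}, X_1, \ldots, X_{n-1}$ as column vectors of length $n$, we can
write the preceding equation as the matrix equation

$$(N_1, \ldots, N_{n-1}) = (X_1, \ldots, X_{n-1})W. \tag{1.4}$$

The matrix multiplication on the right is that of an $n \times (n-1)$ matrix with an
$(n-1) \times (n-1)$ matrix. To understand this abbreviated notation, let us write
it out in the case $n = 3$, so that $X_1, X_2, N_1, N_2$ are vectors in $\mathbb{R}^3$:

$$X_1 = \begin{pmatrix} X_{11} \\ X_{12} \\ X_{13} \end{pmatrix}, \quad
X_2 = \begin{pmatrix} X_{21} \\ X_{22} \\ X_{23} \end{pmatrix}, \quad
N_1 = \begin{pmatrix} N_{11} \\ N_{12} \\ N_{13} \end{pmatrix}, \quad
N_2 = \begin{pmatrix} N_{21} \\ N_{22} \\ N_{23} \end{pmatrix}.$$

Then (1.4) is the matrix equation

$$\begin{pmatrix} N_{11} & N_{21} \\ N_{12} & N_{22} \\ N_{13} & N_{23} \end{pmatrix}
= \begin{pmatrix} X_{11} & X_{21} \\ X_{12} & X_{22} \\ X_{13} & X_{23} \end{pmatrix}
\begin{pmatrix} W_{11} & W_{12} \\ W_{21} & W_{22} \end{pmatrix}.$$

Matrix multiplication shows that this gives

$$N_1 = W_{11}X_1 + W_{21}X_2, \quad N_2 = W_{12}X_1 + W_{22}X_2,$$

and more generally that (1.4) gives $N_i = \sum_j W_{ji}X_j$ in all dimensions.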

Now consider the region $Y_h$, the thickened hypersurface, introduced in the
preceding section except that we replace the full hypersurface $Y$ by the portion
$X(M)$. Thus the region in space that we are considering is

$$\{X(y) + \lambda N(y),\ y \in M,\ 0 < \lambda \le h\}.$$

It is the image of the region $M \times (0,h] \subset \mathbb{R}^{n-1} \times \mathbb{R}$ under the map

$$(y, \lambda) \mapsto X(y) + \lambda N(y).$$

We are assuming that this map is injective. By (1.4), it has Jacobian matrix
(differential)

$$J = (X_1 + \lambda N_1, \ldots, X_{n-1} + \lambda N_{n-1}, N)
= (X_1, \ldots, X_{n-1}, N)\begin{pmatrix} I_{n-1} + \lambda W & 0 \\ 0 & 1 \end{pmatrix}. \tag{1.5}$$

The right hand side of (1.5) is now the product of two $n$ by $n$ matrices.
The change of variables formula in several variables says that

$$V_n(h) = \int_M \int_0^h |\det J|\, d\lambda\, dy^1 \cdots dy^{n-1}. \tag{1.6}$$

Let us take the determinant of the right hand side of (1.5). The determinant
of the matrix $(X_1, \ldots, X_{n-1}, N)$ is just the (oriented) $n$ dimensional volume of
the parallelepiped spanned by $X_1, \ldots, X_{n-1}, N$. Since $N$ is of unit length and
is perpendicular to the $X$'s, this is the same as the (oriented) $n-1$ dimensional
volume of the parallelepiped spanned by $X_1, \ldots, X_{n-1}$. Thus, "by definition",

$$|\det(X_1, \ldots, X_{n-1}, N)|\, dy^1 \cdots dy^{n-1} = d^{n-1}A. \tag{1.7}$$

(We will come back shortly to discuss why this is the right definition.) The
second factor on the right hand side of (1.5) contributes

$$\det(1 + \lambda W) = (1 + \lambda k_1) \cdots (1 + \lambda k_{n-1}).$$

For sufficiently small $\lambda$, this expression is positive, so we need not worry about
the absolute value sign if $h$ is small enough. By (1.1), expanding the product gives
$\sum_{j=0}^{n-1} \binom{n-1}{j} H_j \lambda^j$, and $\frac{1}{j+1}\binom{n-1}{j} = \frac{1}{n}\binom{n}{j+1}$, so integrating
with respect to $\lambda$ from $0$ to $h$ gives (1.2).

We proved (1.2) if we define $d^{n-1}A$ to be given by (1.7). But then it follows
from (1.2) that

$$\frac{d}{dh} V_n(Y_h)\Big|_{h=0} = \int_Y d^{n-1}A. \tag{1.8}$$

A moment's thought shows that the left hand side of (1.8) is exactly what we
want to mean by "area": it is the "volume of an infinitesimally thickened region".
This justifies taking (1.7) as a definition. Furthermore, although the definition
(1.7) is only valid in a coordinate neighborhood, and seems to depend on the
choice of local coordinates, equation (1.8) shows that it is independent of the
local description by coordinates, and hence is a well defined object on $Y$. The
functions $H_j$ have been defined independent of any choice of local coordinates.
Hence (1.2) works globally: To compute the right hand side of (1.2) we may
have to break $Y$ up into patches, and do the integration in each patch, summing
the pieces. But we know in advance that the final answer is independent of how
we break $Y$ up or which local coordinates we use.

1.4 Gauss's theorema egregium.

Suppose we consider the two sided region about the surface, that is

$$V_n(Y_h^+) + V_n(Y_h^-)$$

corresponding to the two different choices of normals. When we replace $\nu(x)$ by
$-\nu(x)$ at each point, the Gauss map $\nu$ is replaced by $-\nu$, and hence the
Weingarten maps $W_x$ are also replaced by their negatives. The principal curvatures
change sign. Hence, in the above sum the coefficients of the even powers of $h$
cancel, since they are given in terms of products of the principal curvatures with
an odd number of factors. For $n = 3$ we are left with a sum of two terms, the
coefficient of $h$ which is the area, and the coefficient of $h^3$ which is the integral
of the Gaussian curvature. It was the remarkable discovery of Gauss that this
curvature depends only on the intrinsic geometry of the surface, and not on
how the surface is embedded into three space. Thus, for both the cylinder and
the plane the cubic terms vanish, because (locally) the cylinder is isometric to
the plane. We can wrap the plane around the cylinder without stretching or
tearing.

It was this fundamental observation of Gauss that led Riemann to investigate
the intrinsic metric geometry of higher dimensional space, eventually leading
to Einstein's general relativity which derives the gravitational force from the
curvature of space time. A first objective will be to understand this major
theorem of Gauss.

An important generalization of Gauss's result was proved by Hermann Weyl 
in 1939. He showed: if Y is any k dimensional submanifold of n dimensional 
space (so for k = 1, n = 3 Y is a curve in three space), let Y(h) denote the 
"tube" around Y of radius h, the set of all points at distance at most h from Y. Then,
for small h, V n (Y(h)) is a polynomial in h whose coefficients are integrals over 
Y of intrinsic expressions, depending only on the notion of distance within Y. 

Let us multiply both sides of (1.4) on the left by the matrix $(X_1, \ldots, X_{n-1})^T$
to obtain

$$L = QW$$

where $L_{ij} = (X_i, N_j)$ as before, and

$$Q = (Q_{ij}) := (X_i, X_j)$$

is called the matrix of the first fundamental form relative to our choice of
local coordinates. All three matrices in this equality are of size $(n-1) \times (n-1)$.
If we take the determinant of the equation $L = QW$ we obtain

$$\det W = \frac{\det L}{\det Q}, \tag{1.9}$$

an expression for the determinant of the Weingarten map (a geometrical
property of the embedded surface) as the quotient of two local expressions. For the
case $n - 1 = 2$, we thus obtain a local expression for the Gaussian curvature,
$K = \det W$.
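
The relation $L = QW$ can be turned into a small computation. The Python sketch below, written for surfaces in three dimensional space, forms $W = Q^{-1}L$ from the first and second partial derivatives of a parameterization at one point, using $L_{ij} = -(N, X_{ij})$ from (1.3) and taking $N$ to be the normalized vector product of the two tangent vectors. The function name and the choice of the cylinder as test case are our own; the partial derivatives are entered by hand.

```python
import math

def weingarten_matrix(Xu, Xv, Xuu, Xuv, Xvv):
    """W = Q^{-1} L at one point of a surface in R^3, with L_ij = -(N, X_ij)."""
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    cross = (Xu[1] * Xv[2] - Xu[2] * Xv[1],
             Xu[2] * Xv[0] - Xu[0] * Xv[2],
             Xu[0] * Xv[1] - Xu[1] * Xv[0])
    norm = math.sqrt(dot(cross, cross))
    N = [c / norm for c in cross]
    E, F, G = dot(Xu, Xu), dot(Xu, Xv), dot(Xv, Xv)          # entries of Q
    l11, l12, l22 = -dot(N, Xuu), -dot(N, Xuv), -dot(N, Xvv)  # entries of L
    det_q = E * G - F * F
    # Q^{-1} = (1 / det Q) [[G, -F], [-F, E]], multiplied on the right by L.
    return [[(G * l11 - F * l12) / det_q, (G * l12 - F * l22) / det_q],
            [(-F * l11 + E * l12) / det_q, (-F * l12 + E * l22) / det_q]]

# Cylinder X(u, v) = (r cos u, r sin u, v); X_u x X_v points outward.
r, u = 2.0, 0.7
W = weingarten_matrix(
    Xu=(-r * math.sin(u), r * math.cos(u), 0.0),
    Xv=(0.0, 0.0, 1.0),
    Xuu=(-r * math.cos(u), -r * math.sin(u), 0.0),
    Xuv=(0.0, 0.0, 0.0),
    Xvv=(0.0, 0.0, 0.0),
)
K = W[0][0] * W[1][1] - W[0][1] * W[1][0]   # Gaussian curvature, det W
H1 = (W[0][0] + W[1][1]) / 2                # mean curvature
print(W, K, H1)
```

For this cylinder the matrix is diagonal with entries $1/r$ and $0$, so $K = 0$ and $H_1 = 1/(2r)$, matching the earlier examples.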
The first fundamental form encodes the intrinsic geometry of the hypersurface
in terms of local coordinates: it gives the Euclidean geometry of the tangent
space in terms of the basis $X_1, \ldots, X_{n-1}$. If we describe a curve $t \mapsto \gamma(t)$ on
the surface in terms of the coordinates $y^1, \ldots, y^{n-1}$ by giving the functions
$t \mapsto y^j(t)$, $j = 1, \ldots, n-1$, then the chain rule says that

$$\gamma'(t) = \sum_{i=1}^{n-1} X_i(y(t))\,\frac{dy^i}{dt}(t)$$

where

$$y(t) = (y^1(t), \ldots, y^{n-1}(t)).$$

Therefore the (Euclidean) square length of the tangent vector $\gamma'(t)$ is

$$\|\gamma'(t)\|^2 = \sum_{i,j=1}^{n-1} Q_{ij}(y(t))\,\frac{dy^i}{dt}(t)\,\frac{dy^j}{dt}(t).$$

Thus the length of the curve $\gamma$ given by

$$\int \|\gamma'(t)\|\, dt$$

can be computed in terms of $y(t)$ as

$$\int \sqrt{\sum_{i,j=1}^{n-1} Q_{ij}(y(t))\,\frac{dy^i}{dt}(t)\,\frac{dy^j}{dt}(t)}\; dt$$

(so long as the curve lies within the coordinate system).

So two hypersurfaces have the same local intrinsic geometry if they have the
same $Q$ in any local coordinate system.
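
To illustrate how the length of a curve is recovered from $Q$ alone, here is a hedged Python sketch. The parameterization `X` of the unit sphere by longitude and latitude, the central-difference approximation of the partial derivatives, and the midpoint rule are all choices made for this example rather than anything prescribed by the text.

```python
import math

def X(u, v):
    """Example parameterization (an assumption): unit sphere, u = longitude, v = latitude."""
    return (math.cos(u) * math.cos(v), math.sin(u) * math.cos(v), math.sin(v))

def first_fundamental_form(u, v, eps=1e-6):
    """Q = ((X_i, X_j)), with X_u and X_v approximated by central differences."""
    Xu = [(a - b) / (2 * eps) for a, b in zip(X(u + eps, v), X(u - eps, v))]
    Xv = [(a - b) / (2 * eps) for a, b in zip(X(u, v + eps), X(u, v - eps))]
    dot = lambda p, q: sum(a * b for a, b in zip(p, q))
    return [[dot(Xu, Xu), dot(Xu, Xv)], [dot(Xv, Xu), dot(Xv, Xv)]]

def curve_length(y, dy, t0, t1, steps=2000):
    """Midpoint rule for the integral of sqrt(sum_ij Q_ij dy^i/dt dy^j/dt)."""
    dt = (t1 - t0) / steps
    total = 0.0
    for k in range(steps):
        t = t0 + (k + 0.5) * dt
        Q = first_fundamental_form(*y(t))
        d = dy(t)
        speed2 = sum(Q[i][j] * d[i] * d[j] for i in range(2) for j in range(2))
        total += math.sqrt(speed2) * dt
    return total

# Circle of latitude v = pi/3, traversed once: u(t) = t, v(t) = pi/3.
lat = math.pi / 3
length = curve_length(lambda t: (t, lat), lambda t: (1.0, 0.0), 0.0, 2 * math.pi)
print(length, 2 * math.pi * math.cos(lat))
```

The two printed numbers should agree closely, since a circle of latitude $v$ on the unit sphere has Euclidean length $2\pi\cos v$.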

In order to conform with a (somewhat variable) classical literature, we shall
make some slight changes in our notation for the case of surfaces in three
dimensional space. We will denote our local coordinates by $u, v$ instead of $y_1, y_2$
and so $X_u$ will replace $X_1$ and $X_v$ will replace $X_2$, and we will denote the scalar
product of two vectors in three dimensional space by a $\cdot$ instead of $(\ ,\ )$. We
write

$$Q = \begin{pmatrix} E & F \\ F & G \end{pmatrix} \tag{1.10}$$

where

$$E := X_u \cdot X_u \tag{1.11}$$

$$F := X_u \cdot X_v \tag{1.12}$$

$$G := X_v \cdot X_v \tag{1.13}$$

so

$$\det Q = EG - F^2. \tag{1.14}$$
We can write the equations (1.11)-(1.13) as

$$Q = (X_u, X_v)^T (X_u, X_v). \tag{1.15}$$

Similarly, let us set

$$e := N \cdot X_{uu} \tag{1.16}$$

$$f := N \cdot X_{uv} \tag{1.17}$$

$$g := N \cdot X_{vv} \tag{1.18}$$

so, by (1.3),

$$L = -\begin{pmatrix} e & f \\ f & g \end{pmatrix} \tag{1.19}$$

and

$$\det L = eg - f^2.$$

Hence (1.9) specializes to

$$K = \frac{eg - f^2}{EG - F^2}, \tag{1.20}$$

an expression for the Gaussian curvature in local coordinates. We can make
this expression even more explicit, using the notion of vector product. Notice
that the unit normal vector, $N$, is given by

$$N = \frac{1}{\|X_u \times X_v\|}\, X_u \times X_v$$

and

$$\|X_u \times X_v\| = \sqrt{\|X_u\|^2 \|X_v\|^2 - (X_u \cdot X_v)^2} = \sqrt{EG - F^2}.$$

Therefore

$$e = N \cdot X_{uu} = \frac{1}{\sqrt{EG - F^2}}\, X_{uu} \cdot (X_u \times X_v) = \frac{1}{\sqrt{EG - F^2}} \det(X_{uu}, X_u, X_v).$$

This last determinant is the determinant of the three by three matrix whose
columns are the vectors $X_{uu}$, $X_u$ and $X_v$. Replacing the first column by $X_{uv}$
gives a corresponding expression for $f$, and replacing the first column by $X_{vv}$
gives the expression for $g$. Substituting into (1.20) gives

$$K = \frac{\det(X_{uu}, X_u, X_v)\det(X_{vv}, X_u, X_v) - \det(X_{uv}, X_u, X_v)^2}{\left[(X_u \cdot X_u)(X_v \cdot X_v) - (X_u \cdot X_v)^2\right]^2}. \tag{1.21}$$
This expression is rather complicated for computation by hand, since it
involves all those determinants. However a symbolic manipulation program such
as Maple or Mathematica can handle it with ease. Here is the instruction for
Mathematica, taken from a recent book by Gray (1993), in terms of a function
X[u,v] defined in Mathematica:

    (* Gaussian curvature via (1.21); uu, vv are dummy variables replaced at the end *)
    gcurvature[X_][u_,v_] := Simplify[
      (Det[{D[X[uu,vv],uu,uu], D[X[uu,vv],uu], D[X[uu,vv],vv]}]*
       Det[{D[X[uu,vv],vv,vv], D[X[uu,vv],uu], D[X[uu,vv],vv]}] -
       Det[{D[X[uu,vv],uu,vv], D[X[uu,vv],uu], D[X[uu,vv],vv]}]^2)/
      (D[X[uu,vv],uu].D[X[uu,vv],uu]*
       D[X[uu,vv],vv].D[X[uu,vv],vv] -
       (D[X[uu,vv],uu].D[X[uu,vv],vv])^2)^2] /. {uu->u, vv->v}
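
For readers without a symbolic manipulation program, formula (1.21) can also be evaluated numerically. The following Python sketch is not the instruction from Gray's book: it replaces symbolic differentiation by central finite differences, and the step size `h`, the helper names, and the sphere and torus test surfaces are assumptions made for the illustration.

```python
import math

def det3(a, b, c):
    """Determinant of the 3x3 matrix whose columns are a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - b[0] * (a[1] * c[2] - a[2] * c[1])
            + c[0] * (a[1] * b[2] - a[2] * b[1]))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def gaussian_curvature(X, u, v, h=1e-3):
    """Evaluate (1.21), approximating the partials of X by central differences."""
    def lin(*terms):
        # Linear combination sum of c * X(p, q) over the given (c, p, q) triples.
        return [sum(c * X(p, q)[i] for c, p, q in terms) for i in range(3)]
    Xu = lin((0.5 / h, u + h, v), (-0.5 / h, u - h, v))
    Xv = lin((0.5 / h, u, v + h), (-0.5 / h, u, v - h))
    Xuu = lin((1 / h**2, u + h, v), (-2 / h**2, u, v), (1 / h**2, u - h, v))
    Xvv = lin((1 / h**2, u, v + h), (-2 / h**2, u, v), (1 / h**2, u, v - h))
    Xuv = lin((0.25 / h**2, u + h, v + h), (-0.25 / h**2, u + h, v - h),
              (-0.25 / h**2, u - h, v + h), (0.25 / h**2, u - h, v - h))
    num = det3(Xuu, Xu, Xv) * det3(Xvv, Xu, Xv) - det3(Xuv, Xu, Xv)**2
    den = (dot(Xu, Xu) * dot(Xv, Xv) - dot(Xu, Xv)**2)**2
    return num / den

def sphere(u, v, R=2.0):
    return (R * math.cos(u) * math.cos(v), R * math.sin(u) * math.cos(v), R * math.sin(v))

def torus(u, v, a=2.0, b=1.0):
    return ((a + b * math.cos(v)) * math.cos(u),
            (a + b * math.cos(v)) * math.sin(u),
            b * math.sin(v))

K_sphere = gaussian_curvature(sphere, 0.3, 0.4)
K_torus = gaussian_curvature(torus, 0.3, 0.0)
print(K_sphere, K_torus)
```

For the sphere of radius $R = 2$ the value should be close to $1/R^2$, and on the outer equator of the torus close to $1/(b(a+b))$; the agreement is only approximate because of the finite-difference error.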



We are now in a position to give two proofs, both correct but both somewhat 
unsatisfactory of Gauss's Theorema egregium which asserts that the Gaussian 
curvature is an intrinsic property of the metrical character of the surface. How- 
ever each proof does have its merits. 



1.4.1 First proof, using inertial coordinates. 

For the first proof, we analyze how the first fundamental form changes when
we change coordinates. Suppose we pass from local coordinates $u, v$ to local
coordinates $u', v'$ where $u = u(u', v')$, $v = v(u', v')$. Expressing $X$ as a function
of $u', v'$ and using the chain rule gives

$$X_{u'} = \frac{\partial u}{\partial u'} X_u + \frac{\partial v}{\partial u'} X_v$$

$$X_{v'} = \frac{\partial u}{\partial v'} X_u + \frac{\partial v}{\partial v'} X_v$$

or

$$(X_{u'}, X_{v'}) = (X_u, X_v)J \quad \text{where} \quad
J := \begin{pmatrix} \dfrac{\partial u}{\partial u'} & \dfrac{\partial u}{\partial v'} \\[2ex] \dfrac{\partial v}{\partial u'} & \dfrac{\partial v}{\partial v'} \end{pmatrix}$$

so

$$Q' = (X_{u'}, X_{v'})^T (X_{u'}, X_{v'}) = J^T Q J.$$

This gives the rule for change of variables of the first fundamental form from the
unprimed to the primed coordinate system, and is valid throughout the range
where the coordinates are defined. Here $J$ is a matrix valued function of $u', v'$.

Let us now concentrate attention on a single point, $P$. The first fundamental
form is a symmetric positive definite matrix. By linear algebra, we can always
find a matrix $R$ such that $R^T Q(u_P, v_P) R = I$, the two dimensional identity
matrix. Here $(u_P, v_P)$ are the coordinates describing $P$. With no loss of generality
we may assume that these coordinates are $(0,0)$. We can then make the linear
change of variables whose $J(0,0)$ is $R$, and so find coordinates such that
$Q(0,0) = I$ in this coordinate system. But we can do better. We claim that we
can choose coordinates so that

$$Q(0) = I, \quad \frac{\partial Q}{\partial u}(0) = \frac{\partial Q}{\partial v}(0) = 0. \tag{1.22}$$

Indeed, suppose we start with a coordinate system with $Q(0) = I$, and look for a
change of coordinates with $J(0) = I$, hoping to determine the second derivatives
so that (1.22) holds. Writing $Q' = J^T Q J$ and using Leibniz's formula for the
derivative of a product, the equations become

$$\frac{\partial (J + J^T)}{\partial u'}(0) = -\frac{\partial Q}{\partial u}(0), \quad
\frac{\partial (J + J^T)}{\partial v'}(0) = -\frac{\partial Q}{\partial v}(0)$$

when we make use of $J(0) = I$. Writing out these equations gives

$$\begin{pmatrix} 2\dfrac{\partial^2 u}{(\partial u')^2} & \dfrac{\partial^2 u}{\partial u' \partial v'} + \dfrac{\partial^2 v}{(\partial u')^2} \\[2ex]
\dfrac{\partial^2 u}{\partial u' \partial v'} + \dfrac{\partial^2 v}{(\partial u')^2} & 2\dfrac{\partial^2 v}{\partial u' \partial v'} \end{pmatrix}(0) = -\frac{\partial Q}{\partial u}(0)$$

$$\begin{pmatrix} 2\dfrac{\partial^2 u}{\partial u' \partial v'} & \dfrac{\partial^2 u}{(\partial v')^2} + \dfrac{\partial^2 v}{\partial u' \partial v'} \\[2ex]
\dfrac{\partial^2 u}{(\partial v')^2} + \dfrac{\partial^2 v}{\partial u' \partial v'} & 2\dfrac{\partial^2 v}{(\partial v')^2} \end{pmatrix}(0) = -\frac{\partial Q}{\partial v}(0).$$

The lower right hand corner of the first equation and the upper left hand corner
of the second equation determine

$$\frac{\partial^2 v}{\partial u' \partial v'}(0) \quad \text{and} \quad \frac{\partial^2 u}{\partial u' \partial v'}(0).$$

All of the remaining second derivatives are then determined (consistently since
$Q$ is a symmetric matrix). We may now choose $u$ and $v$ as functions of $u', v'$
which vanish at $(0,0)$, whose first partial derivatives at $(0,0)$ give $J(0) = I$, and
with the second derivatives as above. For example, we can choose $u - u'$ and $v - v'$
to be homogeneous quadratic polynomials in $u'$ and $v'$ with the above partial
derivatives. A coordinate system in which (1.22) holds (at a point $P$ having
coordinates $(0,0)$) is called an inertial coordinate system based at $P$. Obviously
the collection of all inertial coordinate systems based at $P$ is intrinsically
associated to the metric, since the definition depends only on properties of $Q$ in
the coordinate system. We now claim the following

Proposition 1 If $u, v$ is an inertial coordinate system of an embedded surface
based at $P$ then the Gaussian curvature is given by

$$K(P) = F_{uv} - \frac{1}{2}G_{uu} - \frac{1}{2}E_{vv} \tag{1.23}$$

the expression on the right being evaluated at $(0,0)$.

As the collection of inertial systems is intrinsic, and as (1.23) expresses the 
curvature in terms of a local expression for the metric in an inertial coordinate 
system, the proposition implies the Theorema egregium. 

To prove the proposition, let us first make a rotation and translation in three
dimensional space (if necessary) so that $X(P)$ is at the origin and the tangent
plane to the surface at $P$ is the $x, y$ plane. The fact that $Q(0) = I$ implies
that the vectors $X_u(0), X_v(0)$ form an orthonormal basis of the $x, y$ plane, so
by a further rotation, if necessary, we may assume that $X_u$ is the unit vector
in the positive $x$-direction and, by replacing $v$ by $-v$ if necessary, that $X_v$ is
the unit vector in the positive $y$-direction. These Euclidean motions we used do
not change the value of the determinant of the Weingarten map and so have no
effect on the curvature. If we replace $v$ by $-v$, $E$ and $G$ are unchanged and $G_{uu}$
and $E_{vv}$ are also unchanged. Under the change $v \mapsto -v$, $F$ goes to $-F$, but the
cross derivative $F_{uv}$ picks up an additional minus sign. So $F_{uv}$ is unchanged.
We have arranged that we need prove (1.23) under the assumptions that

$$X(u,v) = \begin{pmatrix} u + r(u,v) \\ v + s(u,v) \\ f(u,v) \end{pmatrix},$$

where $r$, $s$, and $f$ are functions which vanish together with their first derivatives
at the origin in $u, v$ space. So far we have only used the property $Q(0) = I$, not
the full strength of the definition of an inertial coordinate system. We claim
that if the coordinate system is inertial, all the second partials of $r$ and $s$ also
vanish at the origin. To see this, observe that



$$E = (1 + r_u)^2 + s_u^2 + f_u^2$$

$$F = r_v + r_u r_v + s_u + s_u s_v + f_u f_v$$

$$G = r_v^2 + (1 + s_v)^2 + f_v^2$$

so

$$E_u(0) = 2r_{uu}(0)$$

$$E_v(0) = 2r_{uv}(0)$$

$$F_u(0) = r_{uv}(0) + s_{uu}(0)$$

$$F_v(0) = r_{vv}(0) + s_{uv}(0)$$

$$G_u(0) = 2s_{uv}(0)$$

$$G_v(0) = 2s_{vv}(0).$$

The vanishing of all the first partials of $E$, $F$, and $G$ at $0$ thus implies the
vanishing of the second partial derivatives of $r$ and $s$.

By the way, turning this argument around gives us a geometrically intuitive
way of constructing inertial coordinates for an embedded surface: At any point
$P$ choose orthonormal coordinates in the tangent plane at $P$ and use them to
parameterize the surface. (In the preceding notation just choose $x = u$ and
$y = v$ as coordinates.)

Now $N(0)$ is just the unit vector in the positive $z$-direction and so

$$e = f_{uu}, \qquad f = f_{uv}, \qquad g = f_{vv}$$

(on the left, $f$ is the coefficient of the second fundamental form; on the
right, the function $f$), so

$$K = f_{uu} f_{vv} - f_{uv}^2$$

(all the above meant as values at the origin) since $EG - F^2 = 1$ at the origin.
On the other hand, taking the partial derivatives of the above expressions for
$E$, $F$ and $G$ and evaluating at the origin (in particular discarding terms which
vanish at the origin) gives

$$\begin{aligned}
F_{uv} &= r_{uvv} + s_{uuv} + f_{uu} f_{vv} + f_{uv}^2 \\
E_{vv} &= 2\,[\,r_{uvv} + f_{uv}^2\,] \\
G_{uu} &= 2\,[\,s_{uuv} + f_{uv}^2\,]
\end{aligned}$$

when evaluated at $(0,0)$. So (1.23) holds by direct computation.



1.4.2 Second proof. The Brioschi formula. 

Since the Gaussian curvature depends only on the metric, we should be able to
find a general formula expressing the Gaussian curvature in terms of a metric,
valid in any coordinate system, not just an inertial system. This we shall do by
massaging (1.21). The numerator in (1.21) is the difference of products of two
determinants. Now $\det B = \det B^T$ so $\det A \det B = \det AB^T$ and we can
write the numerator of (1.21) as

$$\det\begin{pmatrix}
X_{uu}\cdot X_{vv} & X_{uu}\cdot X_u & X_{uu}\cdot X_v \\
X_u\cdot X_{vv} & X_u\cdot X_u & X_u\cdot X_v \\
X_v\cdot X_{vv} & X_v\cdot X_u & X_v\cdot X_v
\end{pmatrix}
- \det\begin{pmatrix}
X_{uv}\cdot X_{uv} & X_{uv}\cdot X_u & X_{uv}\cdot X_v \\
X_u\cdot X_{uv} & X_u\cdot X_u & X_u\cdot X_v \\
X_v\cdot X_{uv} & X_v\cdot X_u & X_v\cdot X_v
\end{pmatrix}.$$

Every entry in these matrices, except for the entry in the upper left hand
corner of each, is either of the form $E$, $F$, or $G$ or expressible in
terms of derivatives of $E$, $F$ and $G$. For example, $X_{uu}\cdot X_u = \frac12 E_u$
and $F_u = X_{uu}\cdot X_v + X_u\cdot X_{uv}$, so $X_{uu}\cdot X_v = F_u - \frac12 E_v$,
and so on. So if not for the terms in the upper left hand corners, we would
already have expressed the Gaussian curvature in terms of $E$, $F$ and $G$. So our
problem is how to deal with the two terms in the upper left hand corners. Notice
that the lower right hand two by two blocks in these two matrices are the same.
So (expanding both determinants along the top row, for example) the difference of
the two determinants would be unchanged if we replace the upper left hand term,
$X_{uu}\cdot X_{vv}$, in the first matrix by $X_{uu}\cdot X_{vv} - X_{uv}\cdot X_{uv}$
and the upper left hand term in the second matrix by 0. We now show how to express
$X_{uu}\cdot X_{vv} - X_{uv}\cdot X_{uv}$ in terms of $E$, $F$ and $G$,
and this will then give a proof of the Theorema egregium. We have

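$$\begin{aligned}
X_{uu}\cdot X_{vv} - X_{uv}\cdot X_{uv}
&= (X_u\cdot X_{vv})_u - X_u\cdot X_{vvu} - (X_u\cdot X_{uv})_v + X_u\cdot X_{uvv} \\
&= (X_u\cdot X_{vv})_u - (X_u\cdot X_{uv})_v \\
&= \left((X_u\cdot X_v)_v - X_{uv}\cdot X_v\right)_u - \tfrac12 (X_u\cdot X_u)_{vv} \\
&= (X_u\cdot X_v)_{vu} - \tfrac12 (X_v\cdot X_v)_{uu} - \tfrac12 (X_u\cdot X_u)_{vv} \\
&= -\tfrac12 E_{vv} + F_{uv} - \tfrac12 G_{uu}.
\end{aligned}$$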
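We thus obtain Brioschi's formula

$$K = \frac{\det A - \det B}{(EG - F^2)^2} \tag{1.24}$$

where

$$A = \begin{pmatrix}
-\frac12 E_{vv} + F_{uv} - \frac12 G_{uu} & \frac12 E_u & F_u - \frac12 E_v \\
F_v - \frac12 G_u & E & F \\
\frac12 G_v & F & G
\end{pmatrix},
\qquad
B = \begin{pmatrix}
0 & \frac12 E_v & \frac12 G_u \\
\frac12 E_v & E & F \\
\frac12 G_u & F & G
\end{pmatrix}.$$
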

Brioschi's formula is not fit for human use but can be fed to a machine if
necessary. It does give a proof of Gauss' theorem. Notice that if we have
coordinates which are inertial at some point, $P$, then Brioschi's formula reduces
to (1.23), since $E = G = 1$, $F = 0$ and all first partials vanish at $P$. We will
reproduce a Mathematica program for Brioschi's formula from Gray at the end of
this section.
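
As an illustration of "feeding the formula to a machine", here is a minimal
Python sketch of (1.24). It is not the Mathematica program from Gray referred to
above; the function names `det3`, `brioschi_curvature` and `sphere_example` are
invented for this sketch, and the caller is assumed to supply the values of
$E$, $F$, $G$ and of the needed partial derivatives at the point in question.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def brioschi_curvature(E, F, G, E_u, E_v, F_u, F_v, G_u, G_v, E_vv, F_uv, G_uu):
    """Gaussian curvature at one point from the metric coefficients and
    their partial derivatives at that point, following (1.24)."""
    A = [
        [-0.5 * E_vv + F_uv - 0.5 * G_uu, 0.5 * E_u, F_u - 0.5 * E_v],
        [F_v - 0.5 * G_u, E, F],
        [0.5 * G_v, F, G],
    ]
    B = [
        [0.0, 0.5 * E_v, 0.5 * G_u],
        [0.5 * E_v, E, F],
        [0.5 * G_u, F, G],
    ]
    return (det3(A) - det3(B)) / (E * G - F * F) ** 2


def sphere_example(R, u):
    """Sphere of radius R with u = latitude, v = longitude (an assumed test
    case): E = R^2, F = 0, G = R^2 cos^2 u."""
    E = R * R
    G = R * R * math.cos(u) ** 2
    G_u = -2.0 * R * R * math.cos(u) * math.sin(u)
    G_uu = -2.0 * R * R * math.cos(2.0 * u)
    return brioschi_curvature(E, 0.0, G, 0.0, 0.0, 0.0, 0.0, G_u, 0.0,
                              0.0, 0.0, G_uu)
```

For the sphere the sketch should return $1/R^2$ at every latitude away from the
poles, and at a point where the coordinates are inertial it returns
$-\frac12 E_{vv} + F_{uv} - \frac12 G_{uu}$, in agreement with (1.23).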

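In case we have orthogonal coordinates, a coordinate system in which
$F \equiv 0$, Brioschi's formula simplifies and becomes useful: If we set
$F = F_u = F_v = 0$ in Brioschi's formula and expand the determinants we get

$$\begin{aligned}
K &= \frac{1}{(EG)^2}\left[\left(-\tfrac12 E_{vv} - \tfrac12 G_{uu}\right)EG
 + \tfrac14 E_u G_u G + \tfrac14 E_v G_v E + \tfrac14 E_v^2 G + \tfrac14 G_u^2 E\right] \\
&= \left[-\frac12\frac{E_{vv}}{EG} + \frac14\frac{E_v^2}{E^2G} + \frac14\frac{E_vG_v}{EG^2}\right]
 + \left[-\frac12\frac{G_{uu}}{EG} + \frac14\frac{G_u^2}{EG^2} + \frac14\frac{E_uG_u}{E^2G}\right].
\end{aligned}$$
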

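We claim that the first bracketed expression can be written as

$$-\frac{1}{\sqrt{EG}}\frac{\partial}{\partial v}\left(\frac{1}{\sqrt G}\frac{\partial\sqrt E}{\partial v}\right).$$

Indeed,

$$\frac{1}{\sqrt{EG}}\frac{\partial}{\partial v}\left(\frac{1}{\sqrt G}\frac{\partial\sqrt E}{\partial v}\right)
= \frac{1}{\sqrt{EG}}\frac{\partial}{\partial v}\left(\frac{E_v}{2\sqrt{EG}}\right)
= -\frac{E_vG_v}{4G^2E} - \frac{E_v^2}{4E^2G} + \frac{E_{vv}}{2EG}.$$

Doing a similar computation for the second bracketed term gives

$$K = -\frac{1}{\sqrt{EG}}\left[\frac{\partial}{\partial u}\left(\frac{1}{\sqrt E}\frac{\partial\sqrt G}{\partial u}\right)
 + \frac{\partial}{\partial v}\left(\frac{1}{\sqrt G}\frac{\partial\sqrt E}{\partial v}\right)\right] \tag{1.25}$$

as the expression for the Gaussian curvature in orthogonal coordinates. We
shall give a more direct proof of this formula and of Gauss' theorema egregium
once we develop the Cartan calculus.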
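
The orthogonal-coordinate formula (1.25) is easy to test numerically. The
following Python sketch approximates the derivatives in (1.25) by central
differences; the function name `gauss_curvature_orthogonal`, the step size `h`
and the two sample metrics are assumptions made for this illustration, not part
of the text.

```python
import math


def gauss_curvature_orthogonal(E, G, u, v, h=1e-4):
    """Approximate K at (u, v) from (1.25); E and G are functions of (u, v)
    giving the metric coefficients of an orthogonal coordinate system."""

    def d_du(fn, a, b):
        # central difference in the u direction
        return (fn(a + h, b) - fn(a - h, b)) / (2 * h)

    def d_dv(fn, a, b):
        # central difference in the v direction
        return (fn(a, b + h) - fn(a, b - h)) / (2 * h)

    def sqrt_e(a, b):
        return math.sqrt(E(a, b))

    def sqrt_g(a, b):
        return math.sqrt(G(a, b))

    def inner_u(a, b):
        # (1/sqrt(E)) d(sqrt(G))/du
        return d_du(sqrt_g, a, b) / sqrt_e(a, b)

    def inner_v(a, b):
        # (1/sqrt(G)) d(sqrt(E))/dv
        return d_dv(sqrt_e, a, b) / sqrt_g(a, b)

    bracket = d_du(inner_u, u, v) + d_dv(inner_v, u, v)
    return -bracket / math.sqrt(E(u, v) * G(u, v))


# Sphere of radius 2 in latitude/longitude coordinates: E = 4, G = 4 cos^2 u.
k_sphere = gauss_curvature_orthogonal(
    lambda u, v: 4.0, lambda u, v: 4.0 * math.cos(u) ** 2, 0.3, 1.0)

# Upper half plane with E = G = 1/v^2 (constant curvature -1).
k_half_plane = gauss_curvature_orthogonal(
    lambda u, v: 1.0 / v ** 2, lambda u, v: 1.0 / v ** 2, 0.2, 1.5)
```

Up to discretization error, `k_sphere` should be close to $1/4$ and
`k_half_plane` close to $-1$.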



1.5 Problem set - Surfaces of revolution. 

The simplest (non-trivial) case is when $n = 2$: the study of a curve in the
plane. For the case of a curve $X(t) = (x(t), y(t))$ in the plane, we have

$$X'(t) = (x'(t), y'(t)), \qquad
N(t) = \pm\frac{1}{\sqrt{x'(t)^2 + y'(t)^2}}\,(-y'(t), x'(t))$$

where the $\pm$ reflects the two possible choices of normals. Equation (1.3) says
that the one by one matrix $L$ is given by

$$L_{11} = -(N, X'') = \mp\frac{1}{\sqrt{x'^2 + y'^2}}\,(-y'x'' + x'y'').$$

The first fundamental form is the one by one matrix given by

$$Q_{11} = \|X'\|^2 = x'^2 + y'^2.$$

So the curvature is

$$\pm\frac{1}{(x'^2 + y'^2)^{3/2}}\,(x''y' - y''x').$$

Verify that a straight line has curvature zero and that the curvature of a
circle of radius $r$ is $\pm 1/r$ with the plus sign when the normal points outward.
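
A quick numerical sanity check of the curvature formula can be sketched in
Python. The function `plane_curvature` and its `sign` argument are names chosen
for this illustration; `sign=+1` is assumed to correspond to the upper sign in
the formulas above, that is, to the normal $N = (-y', x')/\|X'\|$.

```python
import math


def plane_curvature(xp, yp, xpp, ypp, sign=1.0):
    """Curvature of a plane curve from its first derivatives (xp, yp) and
    second derivatives (xpp, ypp) at one parameter value."""
    return sign * (xpp * yp - ypp * xp) / (xp * xp + yp * yp) ** 1.5


# Straight line X(t) = (2t, 3t): the second derivatives vanish.
k_line = plane_curvature(2.0, 3.0, 0.0, 0.0)

# Circle X(t) = (r cos t, r sin t); for this parametrization the normal
# (-y', x')/|X'| points inward, so the outward normal is sign = -1.
r, t = 3.0, 0.7
k_circle_outward = plane_curvature(
    -r * math.sin(t), r * math.cos(t), -r * math.cos(t), -r * math.sin(t),
    sign=-1.0)
```

With these inputs `k_line` is zero and `k_circle_outward` is $1/r$, consistent
with the verification asked for above.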

1. What does this formula reduce to in the case that $x$ is used as a parameter,
i.e. $x(t) = t$, $y = f(x)$?

We want to study a surface in three space obtained by rotating a curve,
$\gamma$, in the $x, z$ plane about the $z$-axis. Such a surface is called a surface
of revolution. Surfaces of revolution form one of the simplest yet most important
classes of surfaces. The sphere, torus, paraboloid, and ellipsoid with two equal
axes are all surfaces of revolution. Because of modes of production going back to
the potter's wheel, the surfaces of many objects of daily life are surfaces of
revolution. We will find that the geometry of the famous Schwarzschild black hole
can be considered as a particular analogue of a surface of revolution in four
dimensional space-time.

Let us temporarily assume that the curve $\gamma$ is given by a function
$x = f(z) > 0$ so that we can use $z, \theta$ as coordinates, where the surface is
given by

$$X(z,\theta) = \begin{pmatrix} f(z)\cos\theta \\ f(z)\sin\theta \\ z \end{pmatrix}$$

and we choose the normal to point away from the $z$-axis.

2. Find the unit normal $\nu(z,\theta)$ and show that the Weingarten map is
diagonal in the $X_z, X_\theta$ basis, in fact, writing $N = \nu(z,\theta)$,

$$N_z = \kappa X_z, \qquad N_\theta = \frac{d}{f}\,X_\theta$$

where $\kappa$ is the curvature of the curve $\gamma$ and where $d$ is the distance
of the normal vector, $\nu$, from the $z$-axis. Therefore the Gaussian curvature is
given by

$$K = \frac{d\,\kappa}{f}. \tag{1.26}$$

Check that the Gaussian curvature of a cylinder vanishes and that of a sphere
of radius $R$ is $1/R^2$.

Notice that (1.26) makes sense even if we can't use $z$ as a parameter
everywhere on $\gamma$. Indeed, suppose that $\gamma$ is a curve in the $x, z$ plane
that does not intersect the $z$-axis, and we construct the corresponding surface of
revolution. At points where the tangent to $\gamma$ is horizontal (parallel to the
$x$-axis) the normal vector to the surface of revolution is vertical, so $d = 0$.
Also the Gaussian curvature vanishes, since the Gauss map takes the entire circle
of revolution into the north or south pole. So (1.26) is correct at these points.
At all other points we can use $z$ as a parameter. But we must watch the sign of
$\kappa$. Remember that the Gaussian curvature of a surface does not depend on the
choice of normal vector, but the curvature of a curve in the plane does. In using
(1.26) we must be sure that the sign of $\kappa$ is the one determined by the
normal pointing away from the $z$-axis.

3. For example, take $\gamma$ to be a circle of radius $r$ centered at a point at
distance $D > r$ from the $z$-axis, say

$$x = D + r\cos\phi, \qquad z = r\sin\phi$$

in terms of an angular parameter, $\phi$. The corresponding surface of revolution
is a torus. Notice that in using (1.26) we have to take $\kappa$ as negative on the
semicircle closer to the $z$-axis. So the Gaussian curvature is negative on the
"inner" half of the torus and positive on the outer half. Using (1.26) and
$\phi, \theta$ as coordinates on the torus, express $K$ as a function of
$\phi, \theta$. Also, express the area element $dA$ in terms of $d\phi\,d\theta$.
Without any computation, show that the total integral of the curvature vanishes,
i.e. $\int_T K\,dA = 0$.

Recall our definitions of $E$, $F$, and $G$ given in equations (1.11)-(1.13). In
the classical literature, one writes the first fundamental form as

$$ds^2 = E\,du^2 + 2F\,du\,dv + G\,dv^2.$$

The meaning of this expression is as follows: let $t \mapsto (u(t), v(t))$ describe
the curve

$$C : t \mapsto X(u(t), v(t))$$

on the surface. Then $ds$ gives the element of arc length of this curve if we
substitute $u = u(t)$, $v = v(t)$ into the expression for the first fundamental
form. So the first fundamental form describes the intrinsic metrical properties of
the surface in terms of the local coordinates. Recall equation (1.25) which says
that if $u, v$ is an orthogonal coordinate system then the expression for the
Gaussian curvature is

$$K = -\frac{1}{\sqrt{EG}}\left[\frac{\partial}{\partial u}\left(\frac{1}{\sqrt E}\frac{\partial\sqrt G}{\partial u}\right)
 + \frac{\partial}{\partial v}\left(\frac{1}{\sqrt G}\frac{\partial\sqrt E}{\partial v}\right)\right].$$

4. Show that the $z, \theta$ coordinates introduced in problem 2 for a surface of
revolution form an orthogonal coordinate system, find $E$ and $G$ and verify (1.25)
for this case.

A curve $s \mapsto C(s)$ on a surface is called a geodesic if its acceleration,
$C''$, is everywhere orthogonal to the surface. Notice that

$$\frac{d}{ds}(C'(s), C'(s)) = 2(C''(s), C'(s))$$

and this $= 0$ if $C$ is a geodesic. The term geodesic refers to a parametrized
curve and the above equation shows that the condition to be a geodesic implies that
$\|C'(s)\|$ is a constant; i.e. that the curve is parametrized by a constant
multiple of arc length. If we use a different parameterization, say $s = s(t)$,
with dot denoting derivative with respect to $t$, then the chain rule implies that

$$\dot C = C'\dot s, \qquad \ddot C = C''\dot s^2 + C'\ddot s.$$

So if we use a parameter other than arc length, the projection of the acceleration
onto the surface is proportional to the tangent vector if $C$ is a geodesic. In
other words, the acceleration is in the plane spanned by the tangent vector to the
curve and the normal vector to the surface. Conversely, suppose we start with a
curve $C$ which has the property that its acceleration lies in the plane spanned by
the tangent vector to the curve and the normal vector to the surface at all points.
Let us reparametrize this curve by arc length. Then $(C'(s), C'(s)) = 1$ and
hence $(C'', C') = 0$. As we are assuming that $\ddot C$ lies in the plane spanned
by $C'$ and the normal vector to the surface at each point of the curve, and that
$\dot s$ is nowhere 0, the vector $C'' = (\ddot C - C'\ddot s)/\dot s^2$ lies in
this plane and is orthogonal to $C'$, hence is normal to the surface. We conclude
that $C$, in its arc length parametrization, is a geodesic. Standard usage calls a
curve which is a geodesic "up to reparametrization" a pregeodesic. I don't like
this terminology but will live with it.

5. Show that the curves $\theta$ = const. in the terminology of problem 2 are all
pregeodesics. Show that the curves $z$ = const. are pregeodesics if and only if $z$
is a critical point of $f$ (i.e. $f'(z) = 0$).

The general setting for the concept of surfaces of revolution is that of a
Riemannian submersion, which will be the subject of Chapter 8.



Chapter 2 

Rules of calculus. 

2.1 Superalgebras. 

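A (commutative associative) superalgebra is a vector space

$$A = A_{\mathrm{even}} \oplus A_{\mathrm{odd}}$$

with a given direct sum decomposition into even and odd pieces, and a map

$$A \times A \to A$$

which is bilinear, satisfies the associative law for multiplication, and

$$\begin{aligned}
A_{\mathrm{even}} \times A_{\mathrm{even}} &\to A_{\mathrm{even}} \\
A_{\mathrm{even}} \times A_{\mathrm{odd}} &\to A_{\mathrm{odd}} \\
A_{\mathrm{odd}} \times A_{\mathrm{even}} &\to A_{\mathrm{odd}} \\
A_{\mathrm{odd}} \times A_{\mathrm{odd}} &\to A_{\mathrm{even}}
\end{aligned}$$

$$\begin{aligned}
\omega\cdot\sigma &= \sigma\cdot\omega \quad \text{if either } \omega \text{ or } \sigma \text{ is even,} \\
\omega\cdot\sigma &= -\sigma\cdot\omega \quad \text{if both } \omega \text{ and } \sigma \text{ are odd.}
\end{aligned}$$

We write these last two conditions as

$$\omega\cdot\sigma = (-1)^{\deg\sigma\,\deg\omega}\,\sigma\cdot\omega.$$

Here $\deg\tau = 0$ if $\tau$ is even, and $\deg\tau = 1 \pmod 2$ if $\tau$ is odd.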

2.2 Differential forms. 

A linear differential form on a manifold, $M$, is a rule which assigns to each
$p \in M$ a linear function on $TM_p$. So a linear differential form, $\omega$,
assigns to each $p$ an element of $TM_p^*$. We will, as usual, only consider linear
differential forms which are smooth.

The superalgebra $\Omega(M)$ is the superalgebra generated by smooth functions
on $M$ (taken as even) and by the linear differential forms, taken as odd.

Multiplication of differential forms is usually denoted by $\wedge$. The number of
differential factors is called the degree of the form. So functions have degree
zero, linear differential forms have degree one.

In terms of local coordinates, the most general linear differential form has
an expression as $a_1dx_1 + \cdots + a_ndx_n$ (where the $a_i$ are functions).
Expressions of the form

$$a_{12}\,dx_1\wedge dx_2 + a_{13}\,dx_1\wedge dx_3 + \cdots + a_{n-1,n}\,dx_{n-1}\wedge dx_n$$

have degree two (and are even). Notice that the multiplication rules require

$$dx_i\wedge dx_j = -dx_j\wedge dx_i$$

and, in particular, $dx_i\wedge dx_i = 0$. So the most general sum of products of
two linear differential forms is a differential form of degree two, and can be
brought to the above form, locally, after collection of coefficients. Similarly,
the most general differential form of degree $k \le n$ on an $n$ dimensional
manifold is a sum, locally, with function coefficients, of expressions of the form

$$dx_{i_1}\wedge\cdots\wedge dx_{i_k}, \qquad i_1 < \cdots < i_k.$$

There are $\binom{n}{k}$ such expressions, and they are all even if $k$ is even,
and odd if $k$ is odd.


2.3 The d operator. 

There is a linear operator $d$ acting on differential forms called exterior
differentiation, which is completely determined by the following rules: It
satisfies Leibniz' rule in the "super" form

$$d(\omega\cdot\sigma) = (d\omega)\cdot\sigma + (-1)^{\deg\omega}\,\omega\cdot(d\sigma).$$

On functions it is given by

$$df = \frac{\partial f}{\partial x_1}dx_1 + \cdots + \frac{\partial f}{\partial x_n}dx_n$$

and, finally,

$$d(dx_i) = 0.$$

Since functions and the $dx_i$ generate, this determines $d$ completely. For
example, on linear differential forms

$$\omega = a_1dx_1 + \cdots + a_ndx_n$$
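we have

$$\begin{aligned}
d\omega &= da_1\wedge dx_1 + \cdots + da_n\wedge dx_n \\
&= \left(\frac{\partial a_1}{\partial x_1}dx_1 + \cdots + \frac{\partial a_1}{\partial x_n}dx_n\right)\wedge dx_1
 + \cdots + \left(\frac{\partial a_n}{\partial x_1}dx_1 + \cdots + \frac{\partial a_n}{\partial x_n}dx_n\right)\wedge dx_n \\
&= \left(\frac{\partial a_2}{\partial x_1} - \frac{\partial a_1}{\partial x_2}\right)dx_1\wedge dx_2
 + \cdots + \left(\frac{\partial a_n}{\partial x_{n-1}} - \frac{\partial a_{n-1}}{\partial x_n}\right)dx_{n-1}\wedge dx_n.
\end{aligned}$$

In particular, equality of mixed derivatives shows that $d^2f = 0$, and hence that
$d^2\omega = 0$ for any differential form. Hence the rules to remember about $d$
are:

$$\begin{aligned}
d(\omega\cdot\sigma) &= (d\omega)\cdot\sigma + (-1)^{\deg\omega}\,\omega\cdot(d\sigma) \\
d^2 &= 0 \\
df &= \frac{\partial f}{\partial x_1}dx_1 + \cdots + \frac{\partial f}{\partial x_n}dx_n.
\end{aligned}$$
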
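
To see the rules for $d$ at work in the smallest case, here is a short worked
example (the particular form chosen is only an illustration): on the plane with
coordinates $x_1, x_2$ take $\omega = -x_2\,dx_1 + x_1\,dx_2$.

```latex
\begin{aligned}
d\omega &= d(-x_2)\wedge dx_1 + d(x_1)\wedge dx_2 \\
        &= -dx_2\wedge dx_1 + dx_1\wedge dx_2 \\
        &= 2\,dx_1\wedge dx_2 ,
\end{aligned}
\qquad\text{and, for } f = x_1x_2:\quad
d(df) = d(x_2\,dx_1 + x_1\,dx_2) = dx_2\wedge dx_1 + dx_1\wedge dx_2 = 0 .
```

The first computation uses only Leibniz' rule, $d(dx_i) = 0$ and the
antisymmetry $dx_2\wedge dx_1 = -dx_1\wedge dx_2$; the second is an instance of
$d^2 = 0$.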

2.4 Derivations. 

A linear operator $\ell : A \to A$ is called an odd derivation if, like $d$, it
satisfies

$$\ell : A_{\mathrm{even}} \to A_{\mathrm{odd}}, \qquad \ell : A_{\mathrm{odd}} \to A_{\mathrm{even}}$$

and

$$\ell(\omega\cdot\sigma) = (\ell\omega)\cdot\sigma + (-1)^{\deg\omega}\,\omega\cdot\ell\sigma.$$

A linear map $\ell : A \to A$,

$$\ell : A_{\mathrm{even}} \to A_{\mathrm{even}}, \qquad \ell : A_{\mathrm{odd}} \to A_{\mathrm{odd}}$$

satisfying

$$\ell(\omega\cdot\sigma) = (\ell\omega)\cdot\sigma + \omega\cdot(\ell\sigma)$$

is called an even derivation. So the Leibniz rule for derivations, even or odd, is

$$\ell(\omega\cdot\sigma) = (\ell\omega)\cdot\sigma + (-1)^{\deg\ell\,\deg\omega}\,\omega\cdot\ell\sigma.$$

Knowing the action of a derivation on a set of generators of a superalgebra
determines it completely. For example, the equations

$$d(x_i) = dx_i, \qquad d(dx_i) = 0 \quad \forall i$$

imply that

$$dp = \frac{\partial p}{\partial x_1}dx_1 + \cdots + \frac{\partial p}{\partial x_n}dx_n$$

for any polynomial $p$, and hence determine the value of $d$ on any differential
form with polynomial coefficients. The local formula we gave for $df$, where $f$ is
any differentiable function, was just the natural extension (by continuity, if you
like) of the above formula for polynomials.

The sum of two even derivations is an even derivation, and the sum of two 
odd derivations is an odd derivation. 

The composition of two derivations will not, in general, be a derivation, but
an instructive computation from the definitions shows that the commutator

$$[\ell_1, \ell_2] := \ell_1\circ\ell_2 - (-1)^{\deg\ell_1\,\deg\ell_2}\,\ell_2\circ\ell_1$$

is again a derivation which is even if both are even or both are odd, and odd if
one is even and the other odd.

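A derivation followed by a multiplication is again a derivation: specifically,
let $\ell$ be a derivation (even or odd) and let $\tau$ be an even or odd element
of $A$. Consider the map

$$\omega \mapsto \tau\ell\omega.$$

We have

$$\begin{aligned}
\tau\ell(\omega\sigma) &= (\tau\ell\omega)\cdot\sigma + (-1)^{\deg\ell\,\deg\omega}\,\tau\omega\cdot\ell\sigma \\
&= (\tau\ell\omega)\cdot\sigma + (-1)^{(\deg\ell + \deg\tau)\deg\omega}\,\omega\cdot(\tau\ell\sigma)
\end{aligned}$$

so $\omega \mapsto \tau\ell\omega$ is a derivation whose degree is

$$\deg\tau + \deg\ell.$$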

2.5 Pullback. 

Let $\phi : M \to N$ be a smooth map. Then the pullback map $\phi^*$ is a linear
map that sends differential forms on $N$ to differential forms on $M$ and satisfies

$$\begin{aligned}
\phi^*(\omega\wedge\sigma) &= \phi^*\omega\wedge\phi^*\sigma \\
\phi^*d\omega &= d\phi^*\omega \\
\phi^*f &= f\circ\phi.
\end{aligned}$$

The first two equations imply that $\phi^*$ is completely determined by what it
does on functions. The last equation says that on functions, $\phi^*$ is given by
"substitution": In terms of local coordinates on $M$ and on $N$, $\phi$ is given by

$$\phi(x^1,\ldots,x^m) = (y^1,\ldots,y^n), \qquad
y^i = \phi^i(x^1,\ldots,x^m), \quad i = 1,\ldots,n$$

where the $\phi^i$ are smooth functions. The local expression for the pullback of
a function $f(y^1,\ldots,y^n)$ is to substitute $\phi^i$ for the $y^i$ in the
expression for $f$ so as to obtain a function of the $x$'s.

It is important to observe that the pullback on differential forms is defined
for any smooth map, not merely for diffeomorphisms. This is the great
advantage of the calculus of differential forms.
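
A small worked example may help fix the "substitution" rule; the map and the
form below are chosen only for illustration. Let $M = \mathbf{R}$ with coordinate
$x$, let $N = \mathbf{R}^2$ with coordinates $y^1, y^2$, and let
$\phi(x) = (\cos x, \sin x)$, which is smooth but certainly not a
diffeomorphism. For $\omega = -y^2\,dy^1 + y^1\,dy^2$ the three rules give

```latex
\begin{aligned}
\phi^*\omega &= -(\phi^*y^2)\, d(\phi^*y^1) + (\phi^*y^1)\, d(\phi^*y^2) \\
             &= -\sin x \; d(\cos x) + \cos x \; d(\sin x) \\
             &= (\sin^2 x + \cos^2 x)\, dx \;=\; dx .
\end{aligned}
```

Here $\phi^*y^i = y^i\circ\phi$ is the substitution rule on functions, and
$\phi^*dy^i = d(\phi^*y^i)$ is the compatibility of pullback with $d$.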
2.6 Chain rule.

Suppose that $\psi : N \to P$ is a smooth map so that the composition

$$\psi\circ\phi : M \to P$$

is again smooth. Then the chain rule says

$$(\psi\circ\phi)^* = \phi^*\circ\psi^*.$$

On functions this is essentially a tautology: it is the associativity of
composition, $f\circ(\psi\circ\phi) = (f\circ\psi)\circ\phi$. But since pullback
is completely determined by what it does on functions, the chain rule applies to
differential forms of any degree.



2.7 Lie derivative. 

Let $\phi_t$ be a one parameter group of transformations of $M$. If $\omega$ is a
differential form, we get a family of differential forms, $\phi_t^*\omega$,
depending differentiably on $t$, and so we can take the derivative at $t = 0$:

$$\frac{d}{dt}(\phi_t^*\omega)\Big|_{t=0} = \lim_{t\to 0}\frac{1}{t}\left[\phi_t^*\omega - \omega\right].$$

Since $\phi_t^*(\omega\wedge\sigma) = \phi_t^*\omega\wedge\phi_t^*\sigma$ it follows
from the Leibniz argument that

$$\ell_\phi : \omega \mapsto \frac{d}{dt}(\phi_t^*\omega)\Big|_{t=0}$$

is an even derivation. We want a formula for this derivation.

Notice that since $\phi_t^*d = d\phi_t^*$ for all $t$, it follows by
differentiation that

$$\ell_\phi d = d\ell_\phi$$

and hence the formula for $\ell_\phi$ is completely determined by how it acts on
functions.

Let $X$ be the vector field generating $\phi_t$. Recall that the geometrical
significance of this vector field is as follows: If we fix a point $x$, then

$$t \mapsto \phi_t(x)$$

is a curve which passes through the point $x$ at $t = 0$. The tangent to this curve
at $t = 0$ is the vector $X(x)$. In terms of local coordinates, $X$ has coordinates
$X = (X^1,\ldots,X^n)$ where $X^i(x)$ is the derivative of
$\phi^i(t, x^1,\ldots,x^n)$ with respect to $t$ at $t = 0$. The chain rule then
gives, for any function $f$,

$$\begin{aligned}
\ell_\phi f &= \frac{d}{dt}f\left(\phi^1(t,x^1,\ldots,x^n),\ldots,\phi^n(t,x^1,\ldots,x^n)\right)\Big|_{t=0} \\
&= X^1\frac{\partial f}{\partial x_1} + \cdots + X^n\frac{\partial f}{\partial x_n}.
\end{aligned}$$

For this reason we use the notation

$$X = X^1\frac{\partial}{\partial x_1} + \cdots + X^n\frac{\partial}{\partial x_n}$$

so that the differential operator

$$f \mapsto Xf$$

gives the action of $\ell_\phi$ on functions.

As we mentioned, this action of $\ell_\phi$ on functions determines it completely.
In particular, $\ell_\phi$ depends only on the vector field $X$, so we may write

$$\ell_\phi = L_X$$

where $L_X$ is the even derivation determined by

$$L_Xf = Xf, \qquad L_Xd = dL_X.$$



2.8 Weil's formula. 

But we want a more explicit formula for $L_X$. For this it is useful to introduce
an odd derivation associated to $X$ called the interior product and denoted by
$i(X)$. It is defined as follows: First consider the case where

$$X = \frac{\partial}{\partial x_j}$$

and define its interior product by

$$i\left(\frac{\partial}{\partial x_j}\right)f = 0$$

for all functions $f$, while

$$i\left(\frac{\partial}{\partial x_j}\right)dx_k = 0, \quad k \ne j$$

and

$$i\left(\frac{\partial}{\partial x_j}\right)dx_j = 1.$$

The fact that it is a derivation then gives an easy rule for calculating
$i(\partial/\partial x_j)$ when applied to any differential form: Write the
differential form as

$$\omega + dx_j\wedge\sigma$$

where the expressions for $\omega$ and $\sigma$ do not involve $dx_j$. Then

$$i\left(\frac{\partial}{\partial x_j}\right)[\omega + dx_j\wedge\sigma] = \sigma.$$

The operator

$$X^j\,i\left(\frac{\partial}{\partial x_j}\right),$$

which means first apply $i(\partial/\partial x_j)$ and then multiply by the
function $X^j$, is again an odd derivation, and so we can make the definition

$$i(X) := X^1\,i\left(\frac{\partial}{\partial x_1}\right) + \cdots + X^n\,i\left(\frac{\partial}{\partial x_n}\right). \tag{2.1}$$

It is easy to check that this does not depend on the local coordinate system
used.

Notice that we can write

$$Xf = i(X)df.$$

In particular we have

$$\begin{aligned}
L_Xdx_j &= dL_Xx_j \\
&= dX^j \\
&= d\,i(X)dx_j.
\end{aligned}$$

We can combine these two formulas as follows: Since $i(X)f = 0$ for any function
$f$ we have

$$L_Xf = d\,i(X)f + i(X)df.$$

Since $d\,dx_j = 0$ we have

$$L_Xdx_j = d\,i(X)dx_j + i(X)d\,dx_j.$$

Hence

$$L_X = d\,i(X) + i(X)d = [d, i(X)] \tag{2.2}$$

when applied to functions or to the forms $dx_j$. But the right hand side of the
preceding equation is an even derivation, being the commutator of two odd
derivations. So if the left and right hand sides agree on functions and on the
differential forms $dx_j$ they agree everywhere. This equation, (2.2), known as
Weil's formula, is a basic formula in differential calculus.
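
As a check on Weil's formula in a concrete case (the vector field and form are
chosen here only for illustration), take the rotation field
$X = -x_2\,\partial/\partial x_1 + x_1\,\partial/\partial x_2$ on the plane and
$\omega = x_1\,dx_2$. Computing $L_X\omega$ once from (2.2) and once from the
derivation property together with $L_Xd = dL_X$ gives the same answer:

```latex
\begin{aligned}
i(X)\omega &= x_1 X^2 = x_1^2, \qquad d\,i(X)\omega = 2x_1\,dx_1, \\
d\omega &= dx_1\wedge dx_2, \qquad
i(X)\,d\omega = X^1\,dx_2 - X^2\,dx_1 = -x_2\,dx_2 - x_1\,dx_1, \\
d\,i(X)\omega + i(X)\,d\omega &= x_1\,dx_1 - x_2\,dx_2, \\[4pt]
L_X(x_1\,dx_2) &= (Xx_1)\,dx_2 + x_1\,d(Xx_2) = -x_2\,dx_2 + x_1\,dx_1 .
\end{aligned}
```

The sign in $i(X)(dx_1\wedge dx_2) = X^1dx_2 - X^2dx_1$ comes from $i(X)$ being
an odd derivation.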

We can use the interior product to consider differential forms of degree $k$ as
$k$-multilinear functions on the tangent space at each point. To illustrate, let
$\sigma$ be a differential form of degree two. Then for any vector field, $X$,
$i(X)\sigma$ is a linear differential form, and hence can be evaluated on any
vector field, $Y$, to produce a function. So we define

$$\sigma(X,Y) := [i(X)\sigma](Y).$$

We can use this to express exterior derivative in terms of ordinary derivative
and Lie bracket: If $\theta$ is a linear differential form, we have

$$\begin{aligned}
d\theta(X,Y) &= [i(X)d\theta](Y) \\
i(X)d\theta &= L_X\theta - d(i(X)\theta) \\
d(i(X)\theta)(Y) &= Y[\theta(X)] \\
[L_X\theta](Y) &= L_X[\theta(Y)] - \theta(L_X(Y)) \\
&= X[\theta(Y)] - \theta([X,Y])
\end{aligned}$$

where we have introduced the notation $L_XY =: [X,Y]$, which is legitimate since
on functions we have

$$(L_XY)f = L_X(Yf) - YL_Xf = X(Yf) - Y(Xf)$$

so $L_XY$ as an operator on functions is exactly the commutator of $X$ and $Y$.
(See below for a more detailed geometrical interpretation of $L_XY$.) Putting the
previous pieces together gives

$$d\theta(X,Y) = X\theta(Y) - Y\theta(X) - \theta([X,Y]), \tag{2.3}$$

with similar expressions for differential forms of higher degree.

2.9 Integration. 

Let

$$\omega = f\,dx_1\wedge\cdots\wedge dx_n$$

be a form of degree $n$ on $\mathbf{R}^n$. (Recall that the most general
differential form of degree $n$ is an expression of this type.) Then its integral
is defined by

$$\int_M\omega := \int_M f\,dx_1\cdots dx_n$$

where $M$ is any (measurable) subset. This, of course, is subject to the condition
that the right hand side converges if $M$ is unbounded. There is a lot of hidden
subtlety built into this definition having to do with the notion of orientation.
But for the moment this is a good working definition.

The change of variables formula says that if $\phi : M \to \mathbf{R}^n$ is a
smooth differentiable map which is one to one and whose Jacobian determinant is
everywhere positive, then

$$\int_M\phi^*\omega = \int_{\phi(M)}\omega.$$



2.10 Stokes theorem. 

Let $U$ be a region in $\mathbf{R}^n$ with a chosen orientation and smooth
boundary. We then orient the boundary according to the rule that an outward
pointing normal vector, together with a positive frame on the boundary, give a
positive frame in $\mathbf{R}^n$. If $\sigma$ is an $(n-1)$-form, then

$$\int_{\partial U}\sigma = \int_U d\sigma.$$

A manifold is called orientable if we can choose an atlas consisting of charts
such that the Jacobian of the transition maps $\phi_\alpha\circ\phi_\beta^{-1}$ is
always positive. Such a choice of an atlas is called an orientation. (Not all
manifolds are orientable.) If we have chosen an orientation, then relative to the
charts of our orientation, the transition laws for an $n$-form (where
$n = \dim M$) and for a density are the same. In other words, given an
orientation, we can identify densities with $n$-forms and $n$-forms with
densities. Thus we may integrate $n$-forms. The change of variables formula then
holds for orientation preserving diffeomorphisms, as does Stokes' theorem.



2.11 Lie derivatives of vector fields. 

Let $Y$ be a vector field and $\phi_t$ a one parameter group of transformations
whose "infinitesimal generator" is some other vector field $X$. We can consider the
"pulled back" vector field $\phi_t^*Y$ defined by

$$\phi_t^*Y(x) = d\phi_{-t}\{Y(\phi_tx)\}.$$

In words, we evaluate the vector field $Y$ at the point $\phi_t(x)$, obtaining a
tangent vector at $\phi_t(x)$, and then apply the differential of the (inverse) map
$\phi_{-t}$ to obtain a tangent vector at $x$.

If we differentiate the one parameter family of vector fields $\phi_t^*Y$ with
respect to $t$ and set $t = 0$ we get a vector field which we denote by $L_XY$:

$$L_XY := \frac{d}{dt}\phi_t^*Y\Big|_{t=0}.$$

If $\omega$ is a linear differential form, then we may compute $i(Y)\omega$, which
is a function whose value at any point is obtained by evaluating the linear
function $\omega(x)$ on the tangent vector $Y(x)$. Thus

$$i(\phi_t^*Y)\phi_t^*\omega(x) = \left\langle (d\phi_t)_x^*\,\omega(\phi_tx),\; d\phi_{-t}Y(\phi_tx)\right\rangle = \{i(Y)\omega\}(\phi_tx).$$

In other words,

$$\phi_t^*\{i(Y)\omega\} = i(\phi_t^*Y)\phi_t^*\omega.$$

We have verified this when $\omega$ is a differential form of degree one. It is
trivially true when $\omega$ is a differential form of degree zero, i.e. a
function, since then both sides are zero. But then, by the derivation property, we
conclude that it is true for forms of all degrees. We may rewrite the result in
shorthand form as

$$\phi_t^*\circ i(Y) = i(\phi_t^*Y)\circ\phi_t^*.$$

Since $\phi_t^*d = d\phi_t^*$ we conclude from Weil's formula that

$$\phi_t^*\circ L_Y = L_{\phi_t^*Y}\circ\phi_t^*.$$

Until now the subscript $t$ was superfluous, the formulas being true for any fixed
diffeomorphism. Now we differentiate the preceding equations with respect to $t$
and set $t = 0$. We obtain, using Leibniz's rule,

$$L_X\circ i(Y) = i(L_XY) + i(Y)\circ L_X$$

and

$$L_X\circ L_Y = L_{L_XY} + L_Y\circ L_X.$$

This last equation says that Lie derivative (on forms) with respect to the vector
field $L_XY$ is just the commutator of $L_X$ with $L_Y$:

$$L_{L_XY} = [L_X, L_Y].$$

For this reason we write

$$[X,Y] := L_XY$$

and call it the Lie bracket (or commutator) of the two vector fields $X$ and $Y$.
The equation for interior product can then be written as

$$i([X,Y]) = [L_X, i(Y)].$$

The Lie bracket is antisymmetric in $X$ and $Y$. We may multiply $Y$ by a function
$g$ to obtain a new vector field $gY$. From the definitions we have

$$\phi_t^*(gY) = (\phi_t^*g)\,\phi_t^*Y.$$

Differentiating at $t = 0$ and using Leibniz's rule we get

$$[X, gY] = (Xg)Y + g[X,Y] \tag{2.4}$$

where we use the alternative notation $Xg$ for $L_Xg$. The antisymmetry then
implies that for any differentiable function $f$ we have

$$[fX, Y] = -(Yf)X + f[X,Y]. \tag{2.5}$$

From both this equation and from Weil's formula (applied to differential forms
of degree greater than zero) we see that the Lie derivative with respect to $X$ at
a point $x$ depends on more than the value of the vector field $X$ at $x$.
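
The bracket and rule (2.4) can be explored numerically. The Python sketch below
assumes the standard coordinate expression
$[X,Y]^i = \sum_j (X^j\,\partial Y^i/\partial x_j - Y^j\,\partial X^i/\partial x_j)$,
which follows from the fact that $[X,Y]$ acts on functions as the commutator of
$X$ and $Y$; the function name `lie_bracket`, the step size `h` and the sample
fields are choices made for this illustration.

```python
def lie_bracket(X, Y, p, h=1e-5):
    """Approximate [X, Y] at the point p (a list of floats). X and Y map a
    point to the list of components of a vector field at that point."""
    n = len(p)

    def derivative_along(F, direction):
        # central difference of the components of F along `direction`,
        # i.e. sum_j direction[j] * dF^i/dx_j for each i
        plus = [p[k] + h * direction[k] for k in range(n)]
        minus = [p[k] - h * direction[k] for k in range(n)]
        f_plus, f_minus = F(plus), F(minus)
        return [(f_plus[i] - f_minus[i]) / (2 * h) for i in range(n)]

    dY_along_X = derivative_along(Y, X(p))
    dX_along_Y = derivative_along(X, Y(p))
    return [dY_along_X[i] - dX_along_Y[i] for i in range(n)]


# Sample data on the plane: X = d/dx, Y = x d/dy, g = x*y.
X = lambda p: [1.0, 0.0]
Y = lambda p: [0.0, p[0]]
g = lambda p: p[0] * p[1]
gY = lambda p: [g(p) * c for c in Y(p)]

point = [0.3, 0.7]
bracket_XY = lie_bracket(X, Y, point)    # close to d/dy, i.e. [0, 1]
bracket_XgY = lie_bracket(X, gY, point)  # compare with (Xg)Y + g[X,Y]
```

For these fields $Xg = y$, so (2.4) predicts $[X, gY] = 2xy\,\partial/\partial y$,
which at the sample point has second component $2\cdot 0.3\cdot 0.7 = 0.42$.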

2.12 Jacobi's identity. 

From the fact that $[X,Y]$ acts as the commutator of $X$ and $Y$ it follows that
for any three vector fields $X$, $Y$ and $Z$ we have

$$[X,[Y,Z]] + [Z,[X,Y]] + [Y,[Z,X]] = 0.$$

This is known as Jacobi's identity. We can also derive it from the fact that
$[Y,Z]$ is a natural operation and hence for any one parameter group $\phi_t$ of
diffeomorphisms we have

$$\phi_t^*([Y,Z]) = [\phi_t^*Y, \phi_t^*Z].$$

If $X$ is the infinitesimal generator of $\phi_t$ then differentiating the
preceding equation with respect to $t$ at $t = 0$ gives

$$[X,[Y,Z]] = [[X,Y],Z] + [Y,[X,Z]].$$

In other words, $X$ acts as a derivation of the "multiplication" given by Lie
bracket. This is just Jacobi's identity when we use the antisymmetry of the
bracket. In the future we will have occasion to take cyclic sums such as
those which arise on the left of Jacobi's identity. So if $F$ is a function of
three vector fields (or of three elements of any set) with values in some vector
space (for example in the space of vector fields) we will define the cyclic sum
$\operatorname{Cyc}F$ by

$$\operatorname{Cyc}F(X,Y,Z) := F(X,Y,Z) + F(Y,Z,X) + F(Z,X,Y).$$

With this definition Jacobi's identity becomes

$$\operatorname{Cyc}\,[X,[Y,Z]] = 0. \tag{2.6}$$

Exercises 

2.13 Left invariant forms. 

Let $G$ be a group and $M$ be a set. A left action of $G$ on $M$ consists of a map

$$\phi : G\times M \to M$$

satisfying the conditions

$$\phi(a, \phi(b,m)) = \phi(ab, m)$$

(an associativity law) and

$$\phi(e,m) = m, \quad \forall m \in M$$

where $e$ is the identity element of the group. When there is no risk of confusion
we will write $am$ for $\phi(a,m)$. (But in much of the beginning of the following
exercises there will be a risk of confusion since there will be several different
actions of the same group $G$ on the set $M$.) We think of an action as assigning
to each element $a \in G$ a transformation, $\phi_a$, of $M$:

$$\phi_a : M \to M, \qquad \phi_a : m \mapsto \phi(a,m).$$

So we also use the notation

$$\phi_am = \phi(a,m).$$

For example, we may take $M$ to be the group $G$ itself and let the action be left
multiplication, $L$, so

$$L(a,m) = am.$$

We will write

$$L_a : G \to G, \qquad L_am = am.$$

We may also consider the (left) action of right multiplication:

$$R : G\times G \to G, \qquad R(a,m) = ma^{-1}.$$

(The inverse is needed to get the order right in $R(a, R(b,m)) = R(ab,m)$.) So
we will write

$$R_a : G \to G, \qquad R_am = ma^{-1}.$$

We will be interested in the case that $G$ is a Lie group, which means that
$G$ is a manifold and the multiplication map $G\times G \to G$ and the inverse map
$G \to G$, $a \mapsto a^{-1}$, are both smooth maps. Then the differential,
$(dL_a)_m$, maps the tangent space to $G$ at $m$ to the tangent space to $G$ at
$am$:

$$dL_a : TG_m \to TG_{am}$$

and similarly

$$dR_a : TG_m \to TG_{ma^{-1}}.$$

In particular,

$$dL_{a^{-1}} : TG_a \to TG_e.$$

Let $G = Gl(n)$ be the group of all invertible $n\times n$ matrices. It is an open
subset (hence a submanifold) of the $n^2$ dimensional space $\mathrm{Mat}(n)$ of
all $n\times n$ matrices. We can think of the tautological map which sends every
$A \in G$ into itself, thought of as an element of $\mathrm{Mat}(n)$, as a matrix
valued function on $G$. Put another way, $A$ is a matrix of functions on $G$: each
of the matrix entries $A_{ij}$ of $A$ is a function on $G$. Hence
$dA = (dA_{ij})$ is a matrix of differential forms (or, we may say, a matrix
valued differential form). So we may consider

$$A^{-1}dA$$

which is also a matrix valued differential form on $G$. Let $B$ be a fixed element
of $G$.

1. Show that

$$L_B^*(A^{-1}dA) = A^{-1}dA. \tag{2.7}$$

So each of the entries of $A^{-1}dA$ is left invariant.

2. Show that

$$R_B^*(A^{-1}dA) = B(A^{-1}dA)B^{-1}. \tag{2.8}$$

So the entries of $A^{-1}dA$ are not right invariant (in general), but (2.8) shows
how they are transformed into one another by right multiplication.
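
Equations (2.7) and (2.8) can be spot-checked numerically (this is no
substitute for the proofs asked for). The Python sketch below works with
$2\times 2$ matrices and assumes the identification of each tangent space
$TG_A$ with $\mathrm{Mat}(n)$, under which $A^{-1}dA$ evaluated on a tangent
vector $V$ at $A$ is the matrix $A^{-1}V$, and the differential of a linear map
such as $L_B$ or $R_B$ acts by the same formula as the map itself. All function
names are invented for this illustration.

```python
def matmul(P, Q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inverse(P):
    """Inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = P
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def omega(A, V):
    """Value of A^{-1} dA on the tangent vector V at the point A."""
    return matmul(inverse(A), V)


def pull_back_left(B, A, V):
    # L_B sends A to BA and, being linear, sends the tangent vector V to BV.
    return omega(matmul(B, A), matmul(B, V))


def pull_back_right(B, A, V):
    # R_B sends A to A B^{-1} and the tangent vector V to V B^{-1}.
    B_inv = inverse(B)
    return omega(matmul(A, B_inv), matmul(V, B_inv))


A = [[2.0, 1.0], [0.5, 3.0]]
B = [[1.0, 2.0], [-1.0, 1.5]]
V = [[0.3, -0.7], [1.1, 0.4]]

left_side_27 = pull_back_left(B, A, V)    # should equal omega(A, V)
left_side_28 = pull_back_right(B, A, V)   # should equal B omega(A, V) B^{-1}
right_side_28 = matmul(matmul(B, omega(A, V)), inverse(B))
```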
For any two matrix valued differential forms $R = (R_{ij})$ and $S = (S_{ij})$
define their matrix exterior product $R\wedge S$ by the usual formula for matrix
product, but with exterior multiplication of the entries instead of ordinary
multiplication, so

$$(R\wedge S)_{ik} := \sum_j R_{ij}\wedge S_{jk}.$$

Also, if $R = (R_{ij})$ is a matrix valued differential form, define $dR$ by
applying $d$ to each of the entries. So

$$(dR)_{ij} := (dR_{ij}).$$

Finally, if $\psi : X \to Y$ is a smooth map and $R = (R_{ij})$ is a matrix valued
form on $Y$ then we define its pullback by pulling back each of the entries:

$$(\psi^*R)_{ij} := \psi^*(R_{ij}).$$

2.14 The Maurer-Cartan equations.

3. In elementary calculus we have the formula $d(1/x) = -dx/x^2$. What is the
generalization of this formula for the matrix function $A^{-1}$? In other words,
what is the formula for $d(A^{-1})$?

4. Show that if we set $\omega = A^{-1}dA$ then

$$d\omega + \omega\wedge\omega = 0. \tag{2.9}$$

Here is another way of thinking about $A^{-1}dA$: Since $G = Gl(n)$ is an open
subset of the vector space $\mathrm{Mat}(n)$, we may identify the tangent space
$TG_A$ with the vector space $\mathrm{Mat}(n)$. That is, we have an isomorphism
between $TG_A$ and $\mathrm{Mat}(n)$. If you think about it for a minute, it is the
form $dA$ which effects this isomorphism at every point. On the other hand, left
multiplication by $A^{-1}$ is a linear map. Under this identification, the
differential of a linear map $L$ looks just like $L$. So in terms of this
identification, $A^{-1}dA$, when evaluated at the tangent space $TG_A$, is just the
isomorphism $dL_{A^{-1}} : TG_A \to TG_I$ where $I$ is the identity matrix.

2.15 Restriction to a subgroup 

Let $H$ be a Lie subgroup of $G$. This means that $H$ is a subgroup of $G$ and it is
also a submanifold. In other words we have an embedding

$$\iota : H \to G$$

which is a(n injective) group homomorphism. Let

$$\mathfrak{h} = TH_I$$

denote the tangent space to $H$ at the identity element.

5. Conclude from the preceding discussion that if we now set

$$\omega = \iota^*(A^{-1}\,dA)$$

then $\omega$ takes values in $\mathfrak{h}$. In other words, when we evaluate $\omega$ on any tangent
vector at any point of $H$ we get a matrix belonging to the subspace $\mathfrak{h}$.

6. Show that on a group, the only transformations which commute with all the
right multiplications, $R_b$, $b \in G$, are the left multiplications, $L_a$.

For any vector $\xi \in TH_I$, define the vector field $X$ by

$$X(A) = dR_{A^{-1}}\,\xi.$$

(Recall that $R_{A^{-1}}$ is right multiplication by $A$ and so sends $I$ into $A$.) For
example, if we take $H$ to be the full group $G = Gl(n)$ and identify the tangent
space at every point with $\mathrm{Mat}(n)$ then the above definition becomes

$$X(A) = \xi A.$$

By construction, the vector field $X$ is right invariant, i.e. is invariant under all
the diffeomorphisms $R_B$.

7. Conclude that the flow generated by $X$ is left multiplication by a one parameter
subgroup. Also conclude that in the case $H = Gl(n)$ the flow generated by
$X$ is left multiplication by the one parameter group

$$\exp t\xi.$$

Finally conclude that for a general subgroup $H$, if $\xi \in \mathfrak{h}$ then all the $\exp t\xi$ lie
in $H$.

8. What is the space $\mathfrak{h}$ in the case that $H$ is the group of Euclidean motions
in three dimensional space, thought of as the set of all four by four matrices of
the form

$$\begin{pmatrix} A & v \\ 0 & 1 \end{pmatrix}, \qquad A \in O(3), \quad v \in \mathbb{R}^3\,?$$




2.16 Frames. 



Let $V$ be an $n$ dimensional vector space. Recall that a frame on $V$ is, by definition,
an isomorphism $f : \mathbb{R}^n \to V$. Giving $f$ is the same as giving each of the
vectors $f_i = f(\delta_i)$ where the $\delta_i$ range over the standard basis of $\mathbb{R}^n$. So giving a
frame is the same as giving an ordered basis of $V$ and we will sometimes write

$$f = (f_1, \dots, f_n).$$

If $A \in Gl(n)$ then $A$ is an isomorphism of $\mathbb{R}^n$ with itself, so $f \circ A^{-1}$ is another
frame. So we get an action, $R : Gl(n) \times F \to F$ where $F = F(V)$ denotes the
space of all frames:

$$R(A, f) = f \circ A^{-1}. \tag{2.10}$$
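
The inverse in (2.10) is what makes this a (left) action of the group. A short check, written out as a worked equation using nothing beyond the definition:

```latex
\begin{aligned}
R\bigl(A, R(B, f)\bigr) &= (f \circ B^{-1}) \circ A^{-1}
  = f \circ (AB)^{-1} = R(AB, f), \\
R(I, f) &= f .
\end{aligned}
```

Without the inverse one would get $f \circ B \circ A = f \circ (BA)$, which reverses the order of multiplication.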

If $f$ and $g$ are two frames, then $g^{-1} \circ f = M$ is an isomorphism of $\mathbb{R}^n$ with itself,
i.e. a matrix. So given any two frames, $f$ and $g$, there is a unique $M \in Gl(n)$
so that $g = f \circ M^{-1}$. Once we fix an $f$, we can use this fact to identify $F$ with
$Gl(n)$, but the identification depends on the choice of $f$. But in any event the
(non-unique) identification shows that $F$ is a manifold and that (2.10) defines
an action of $Gl(n)$ on $F$. Each of the $f_i$ (the $i$-th basis vector in the frame)
can be thought of as a $V$ valued function on $F$. So we may write

$$df_j = \sum_i \omega_{ij} f_i \tag{2.11}$$

where the $\omega_{ij}$ are ordinary (number valued) linear differential forms on $F$. We
think of this equation as giving the expansion of an infinitesimal change in $f_j$
in terms of the basis $f = (f_1, \dots, f_n)$. If we use the "row" representation of $f$
as above, we can write these equations as

$$df = f\omega \tag{2.12}$$

where $\omega = (\omega_{ij})$.

9. Show that the $\omega$ defined by (2.12) satisfies

$$R_B^*\omega = B\,\omega\,B^{-1}. \tag{2.13}$$



To see the relation with what went on before, notice that we could take
$V = \mathbb{R}^n$ itself. Then $f$ is just an invertible matrix, $A$, and (2.12) becomes our
old equation $\omega = A^{-1}\,dA$. So (2.13) reduces to (2.8).

If we take the exterior derivative of (2.12) we get

$$0 = d(df) = df \wedge \omega + f\,d\omega = f\,(\omega \wedge \omega + d\omega)$$

from which we conclude

$$d\omega + \omega \wedge \omega = 0. \tag{2.14}$$

2.17 Euclidean frames. 

We specialize to the case where $V = \mathbb{R}^n$, $n = d + 1$, so that the set of frames
becomes identified with the group $Gl(n)$, and restrict to the subgroup, $H$, of
Euclidean motions, which consists of all $n \times n$ matrices of the form

$$\begin{pmatrix} A & v \\ 0 & 1 \end{pmatrix}, \qquad A \in O(d), \quad v \in \mathbb{R}^d.$$

Such a matrix, when applied to a vector

$$\begin{pmatrix} w \\ 1 \end{pmatrix}$$

sends it into the vector

$$\begin{pmatrix} Aw + v \\ 1 \end{pmatrix}$$

and $Aw + v$ is the orthogonal transformation $A$ applied to $w$ followed by the
translation by $v$. The corresponding Euclidean frames (consisting of the columns
of the elements of $H$) are thus defined to be the frames of the form

$$f_i = \begin{pmatrix} e_i \\ 0 \end{pmatrix}, \qquad i = 1, \dots, d,$$

where the $e_i$ form an orthonormal basis of $\mathbb{R}^d$ and

$$f_n = \begin{pmatrix} v \\ 1 \end{pmatrix}$$

where $v \in \mathbb{R}^d$ is an arbitrary vector. The idea is that $v$ represents a choice of
origin in $d$ dimensional space and $e = (e_1, \dots, e_d)$ is an orthonormal basis. We
can write this in shorthand notation as

$$f = \begin{pmatrix} e & v \\ 0 & 1 \end{pmatrix}.$$
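
To make the block form concrete, here is a small sketch for $d = 2$ (so $n = 3$), with $A$ taken to be a rotation. The function names `motion`, `apply` and `matmul` and the numerical values are illustrative choices, not notation from the text.

```python
import math

def motion(angle, v):
    """3x3 matrix (A v; 0 1) with A the rotation by `angle`, an element of O(2)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, v[0]], [s, c, v[1]], [0.0, 0.0, 1.0]]

def apply(M, x):
    """Apply a 3x3 matrix to a column vector of length 3."""
    return [sum(M[i][k] * x[k] for k in range(3)) for i in range(3)]

def matmul(P, Q):
    """Product of two 3x3 matrices."""
    return [[sum(P[i][k] * Q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

M = motion(math.pi / 2, (1.0, 2.0))
# The vector (w, 1) with w = (3, 0): A w = (0, 3), so A w + v = (1, 5).
image = apply(M, [3.0, 0.0, 1.0])

# The product of two such matrices again has last row (0, 0, 1).
product = matmul(M, motion(0.3, (-1.0, 0.5)))
print(image, product[2])
```

The columns of `M` are the Euclidean frame itself: the first two columns are $(e_i, 0)$ and the last column is $(v, 1)$.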



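If $\iota$ denotes the embedding of $H$ into $G$, we know from exercise 5 that

$$\iota^*\omega = \begin{pmatrix} \Omega & \theta \\ 0 & 0 \end{pmatrix}$$

where

$$\Omega_{ij} = -\Omega_{ji}.$$

So the pull back of (2.12) becomes

$$d\begin{pmatrix} e & v \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} e\,\Omega & e\,\theta \\ 0 & 0 \end{pmatrix} \tag{2.15}$$

or, in more expanded notation,

$$de_j = \sum_i \Omega_{ij}\, e_i, \qquad dv = \sum_i \theta_i\, e_i.$$

Let $( \, , \, )$ denote the Euclidean scalar product. Then we can write

$$\theta_i = (dv, e_i) \tag{2.16}$$

and

$$(de_j, e_i) = \Omega_{ij}.$$

If we set

$$\Theta = -\Omega$$

this becomes

$$(de_i, e_j) = \Theta_{ij}. \tag{2.17}$$

Then (2.14) becomes

$$d\theta = \Theta \wedge \theta, \qquad d\Theta = \Theta \wedge \Theta. \tag{2.18}$$

Or, in more expanded notation,

$$d\theta_i = \sum_j \Theta_{ij} \wedge \theta_j, \qquad d\Theta_{ik} = \sum_j \Theta_{ij} \wedge \Theta_{jk}. \tag{2.19}$$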

Equations (2.16)-(2.18) or (2.19) are known as the structure equations of 
Euclidean geometry. 
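
The passage from (2.14) to (2.18) is a block-matrix computation. It can be written out as follows, using only the block form $\iota^*\omega = \begin{pmatrix} \Omega & \theta \\ 0 & 0 \end{pmatrix}$ and the substitution $\Theta = -\Omega$:

```latex
\begin{aligned}
\iota^*\omega \wedge \iota^*\omega
  &= \begin{pmatrix} \Omega & \theta \\ 0 & 0 \end{pmatrix} \wedge
     \begin{pmatrix} \Omega & \theta \\ 0 & 0 \end{pmatrix}
   = \begin{pmatrix} \Omega \wedge \Omega & \Omega \wedge \theta \\ 0 & 0 \end{pmatrix},
\qquad
d(\iota^*\omega) = \begin{pmatrix} d\Omega & d\theta \\ 0 & 0 \end{pmatrix}, \\
% so (2.14) gives
d\Omega &= -\Omega \wedge \Omega, \qquad d\theta = -\Omega \wedge \theta, \\
% and substituting \Omega = -\Theta:
d\Theta &= \Theta \wedge \Theta, \qquad\;\;\; d\theta = \Theta \wedge \theta .
\end{aligned}
```

The sign change in the first equation comes from $d\Theta = -d\Omega = \Omega \wedge \Omega = (-\Theta) \wedge (-\Theta)$.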



2.18 Frames adapted to a submanifold. 

Let $M$ be a $k$ dimensional submanifold of $\mathbb{R}^d$. This determines a submanifold
of the manifold, $H$, of all Euclidean frames by the following requirements:

i) $v \in M$ and

ii) $e_i \in TM_v$ for $i \le k$.

We will usually write $m$ instead of $v$ to emphasize
the first requirement - that the frames be based at points of $M$. The second
requirement says that the first $k$ vectors in the frame based at $m$ be tangent to
$M$ (and hence that the last $d - k$ vectors in the frame are normal to $M$). We
will denote this manifold by $O(M)$. It has dimension

$$k + \tfrac{1}{2}k(k-1) + \tfrac{1}{2}(d-k)(d-k-1).$$

The first term comes from the point $m$ varying on $M$, the second is the dimension
of the orthogonal group $O(k)$ corresponding to the choices of the first $k$ vectors
in the frame, and the third term is $\dim O(d-k)$, corresponding to the last $d - k$
vectors. We have an embedding of $O(M)$ into $H$, and hence the forms $\theta$ and $\Theta$
pull back to $O(M)$. As we are running out of letters, we will continue to denote
these pull backs by the same letters. So the pulled back forms satisfy the same
structure equations (2.16)-(2.18) or (2.19) as above, but they are supplemented
by

$$\theta_i = 0, \qquad \forall\, i > k. \tag{2.20}$$
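
As a quick check on the dimension count for $O(M)$, the three terms can be tabulated for the cases of most interest. The function name `frame_bundle_dim` is an illustrative choice; the formula is the one stated above, with $\dim O(m) = m(m-1)/2$.

```python
def frame_bundle_dim(k, d):
    """Dimension of O(M) for a k-dimensional submanifold M of R^d."""
    dim_base = k                                      # the point m varying on M
    dim_tangent_frames = k * (k - 1) // 2             # dim O(k)
    dim_normal_frames = (d - k) * (d - k - 1) // 2    # dim O(d - k)
    return dim_base + dim_tangent_frames + dim_normal_frames

curve_dim = frame_bundle_dim(1, 3)     # curve in three dimensional space
surface_dim = frame_bundle_dim(2, 3)   # surface in three dimensional space
full_dim = frame_bundle_dim(3, 3)      # M = R^3: all Euclidean frames
print(curve_dim, surface_dim, full_dim)
```

For a curve in three dimensional space this gives 2, for a surface 3, and for $M = \mathbb{R}^3$ it gives 6, the dimension of the group of Euclidean motions.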

2.19 Curves and surfaces - their structure equations.

We will be particularly interested in curves and surfaces in three dimensional
Euclidean space. For a curve, $C$, the manifold of frames is two dimensional, and
we have

$$dC = \theta_1 e_1 \tag{2.21}$$
$$de_1 = \Theta_{12}\, e_2 + \Theta_{13}\, e_3 \tag{2.22}$$
$$de_2 = \Theta_{21}\, e_1 + \Theta_{23}\, e_3 \tag{2.23}$$
$$de_3 = \Theta_{31}\, e_1 + \Theta_{32}\, e_2. \tag{2.24}$$

One can visualize the manifold of frames as a sort of tube: about each point of
the curve there is a circle in the plane normal to the tangent line corresponding
to the possible choices of $e_2$.

For the case of a surface the manifold of frames is three dimensional: we can
think of it as a union of circles each centered at a point of $S$ and in the plane
tangent to $S$ at that point. Then equation (2.21) is replaced by

$$dX = \theta_1 e_1 + \theta_2 e_2 \tag{2.25}$$

but otherwise the equations are as above, including the structure equations
(2.19). These become

$$d\theta_1 = \Theta_{12} \wedge \theta_2 \tag{2.26}$$
$$d\theta_2 = -\Theta_{12} \wedge \theta_1 \tag{2.27}$$
$$0 = \Theta_{31} \wedge \theta_1 + \Theta_{32} \wedge \theta_2 \tag{2.28}$$
$$d\Theta_{12} = \Theta_{13} \wedge \Theta_{32} \tag{2.29}$$
$$d\Theta_{13} = \Theta_{12} \wedge \Theta_{23} \tag{2.30}$$
$$d\Theta_{23} = \Theta_{21} \wedge \Theta_{13} \tag{2.31}$$



Equation (2.29) is known as Gauss' equation, and equations (2.30) and (2.31) 
are known as the Codazzi-Mainardi equations. 



2.20 The sphere as an example. 

In computations with local coordinates, we may find it convenient to use a
"cross-section" of the manifold of frames, that is a map which assigns to each
point of a neighborhood on the surface a preferred frame. If we are given a
parametrization $m = m(u, v)$ of the surface, one way of choosing such a
cross-section is to apply the Gram-Schmidt orthogonalization procedure to the
tangent vector fields $m_u$ and $m_v$, and take into account the chosen orientation.

For example, consider the sphere of radius $R$. We can parameterize the
sphere with the north and south poles (and one longitudinal semi-circle) removed
by $(u, v) \in (0, 2\pi) \times (0, \pi)$, with $X = X(u, v)$ where

$$X(u, v) = \begin{pmatrix} R\cos u \sin v \\ R\sin u \sin v \\ R\cos v \end{pmatrix}.$$

Here $v$ denotes the angular distance from the north pole, so the excluded value
$v = 0$ corresponds to the north pole and the excluded value $v = \pi$ corresponds
to the south pole. Each constant value of $v$ between $0$ and $\pi$ is a circle of latitude
with the equator given by $v = \pi/2$. The parameter $u$ describes the longitude from
the excluded semi-circle.

In any frame adapted to a surface in $\mathbb{R}^3$, the third vector $e_3$ is normal to
the surface at the base point of the frame. There are two such choices at each
base point. In our sphere example let us choose the outward pointing normal,
which at the point $m(u, v)$ is

$$\frac{1}{R}X(u, v) = \begin{pmatrix} \cos u \sin v \\ \sin u \sin v \\ \cos v \end{pmatrix}.$$

We will write the left hand side of this equation as $e_3(u, v)$. The coordinates
$u, v$ are orthogonal, i.e. $X_u$ and $X_v$ are orthogonal at every point, so the
orthonormalization procedure amounts only to normalization: Replace each of
these vectors by the unit vectors pointing in the same direction at each point.
So we get

$$e_1(u, v) = \begin{pmatrix} -\sin u \\ \cos u \\ 0 \end{pmatrix}, \qquad e_2(u, v) = \begin{pmatrix} \cos u \cos v \\ \sin u \cos v \\ -\sin v \end{pmatrix}.$$

We thus obtain a map $\psi$ from $(0, 2\pi) \times (0, \pi)$ to the manifold of frames,

$$\psi(u, v) = (X(u, v), e_1(u, v), e_2(u, v), e_3(u, v)).$$

Since $X_u \cdot e_1 = R \sin v$ and $X_v \cdot e_2 = R$ we have

$$dX(u, v) = (R \sin v\, du)\, e_1(u, v) + (R\, dv)\, e_2(u, v).$$

Thus we see from (2.25) that

$$\psi^*\theta_1 = R \sin v\, du, \qquad \psi^*\theta_2 = R\, dv$$

and hence that

$$\psi^*(\theta_1 \wedge \theta_2) = R^2 \sin v\, du \wedge dv.$$

Now $R^2 \sin v\, du\, dv$ is just the area element of the sphere expressed in $u, v$
coordinates. The choice of $e_1, e_2$ determines an orientation of the tangent space
to the sphere at the point $X(u, v)$ and so $\psi^*(\theta_1 \wedge \theta_2)$ is the pull-back of the
corresponding oriented area form.
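
The coefficients of $\psi^*\theta_1$ and $\psi^*\theta_2$ can be checked numerically by differentiating $X$ with central differences and taking scalar products with the frame vectors. This is only a sketch: the radius, the sample point and the step size `h` are arbitrary test values, and the helper names are illustrative.

```python
import math

R = 2.0  # radius of the sphere (arbitrary test value)

def X(u, v):
    return (R * math.cos(u) * math.sin(v),
            R * math.sin(u) * math.sin(v),
            R * math.cos(v))

def e1(u, v):
    return (-math.sin(u), math.cos(u), 0.0)

def e2(u, v):
    return (math.cos(u) * math.cos(v), math.sin(u) * math.cos(v), -math.sin(v))

def e3(u, v):
    return (math.cos(u) * math.sin(v), math.sin(u) * math.sin(v), math.cos(v))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def partial(F, u, v, wrt, h=1e-6):
    """Central-difference partial derivative of a vector valued F(u, v)."""
    if wrt == "u":
        plus, minus = F(u + h, v), F(u - h, v)
    else:
        plus, minus = F(u, v + h), F(u, v - h)
    return tuple((a - b) / (2 * h) for a, b in zip(plus, minus))

u, v = 0.7, 1.1
Xu, Xv = partial(X, u, v, "u"), partial(X, u, v, "v")

theta1_du = dot(Xu, e1(u, v))   # coefficient of du in theta_1: R sin v
theta1_dv = dot(Xv, e1(u, v))   # coefficient of dv in theta_1: 0
theta2_du = dot(Xu, e2(u, v))   # coefficient of du in theta_2: 0
theta2_dv = dot(Xv, e2(u, v))   # coefficient of dv in theta_2: R
normal_part = (dot(Xu, e3(u, v)), dot(Xv, e3(u, v)))  # dX has no e_3 component
print(theta1_du, theta1_dv, theta2_du, theta2_dv, normal_part)
```

The same finite-difference approach, applied to $e_1$, $e_2$ instead of $X$, gives a way to check by machine whatever one computes by hand for the forms $\psi^*\Theta_{ij}$.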






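10. Compute $\psi^*\Theta_{12}$, $\psi^*\Theta_{13}$, and $\psi^*\Theta_{23}$ and verify that

$$\psi^*(d\Theta_{12}) = -K\, \psi^*(\theta_1 \wedge \theta_2)$$

where $K = 1/R^2$ is the curvature of the sphere.

We will generalize this equation to an arbitrary surface in $\mathbb{R}^3$ in section ??.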

2.21 Ribbons 

The idea here is to study a curve on a surface, or rather a curve with an
"infinitesimal" neighborhood of a surface along it. So let $C$ be a curve and
$O(C)$ its associated two dimensional manifold of frames. We have a projection
$\pi : O(C) \to C$ sending every frame into its origin. By a ribbon based on $C$ we
mean a section $n : C \to O(C)$, so $n$ assigns a unique frame to each point of the
curve in a smooth way. We will only be considering curves with non-vanishing
tangent vector everywhere. With no loss of generality we may assume that we
have parametrized the curve by arc length, and the choice of $e_1$ determines an
orientation of the curve, so $\theta = ds$. The choice of $e_2$ at every point then
determines $e_3$ up to a $\pm$ sign. So a good way to visualize a ribbon is to think of a rigid
metal ribbon determined by the curve and the vectors $e_2$ perpendicular to the
curve (determined by $n$) at each point. The forms $\Theta_{ij}$ all pull back under $n$ to
function multiples of $ds$:

$$n^*\Theta_{12} = k\, ds, \qquad n^*\Theta_{23} = -\tau\, ds, \qquad n^*\Theta_{13} = w\, ds \tag{2.32}$$

where $k$, $\tau$ and $w$ are functions of $s$. We can write equations (2.21)-(2.24) above
as

$$\frac{dC}{ds} = e_1$$

and

$$\frac{de_1}{ds} = k e_2 + w e_3, \qquad \frac{de_2}{ds} = -k e_1 - \tau e_3, \qquad \frac{de_3}{ds} = -w e_1 + \tau e_2. \tag{2.33}$$

For later applications we will sometimes be sloppy and write $\Theta_{ij}$ instead of
$n^*\Theta_{ij}$ for the pull back to the curve, so along the ribbon we have $\Theta_{12} = k\, ds$ etc.
Also it will sometimes be convenient in computations (as opposed to proving
theorems) to use parameters other than arc length.

11. Show that two ribbons (defined over the same interval of $s$ values) are
congruent (that is there is a Euclidean motion carrying one into the other) if
and only if the functions $k$, $\tau$, and $w$ are the same.

A ribbon is really just a curve in the space, $H$, of all Euclidean frames, having
the property that the base point, that is the $v$ of the frame $(v, e_1, e_2, e_3)$, has
non-vanishing derivative. The previous exercise says that two curves, $i : I \to H$ and
$j : I \to H$ in $H$ differ by an overall left translation (that is, satisfy $j = L_h \circ i$
for some $h \in H$) if
and only if the forms $\theta$, $\Theta_{12}$, $\Theta_{13}$, $\Theta_{23}$ pull back to the same forms on $I$. The
form $i^*\theta$ is just the arc length form $ds$ as we mentioned above. It is absolutely
crucial for the rest of this course to understand the meaning of the form $i^*\Theta_{12}$.

Consider a circle of latitude on a sphere of radius $R$. To fix the notation,
suppose that the circle is at angular distance $v$ from the north pole and that
we use $u$ as angular coordinates along the circle. Take the ribbon adapted to
the sphere, so $e_1$ is the unit tangent vector to the circle of latitude and $e_2$ is the
unit tangent vector to the circle of longitude chosen as above. Problem 10 then
implies that $i^*\Theta_{12} = -\cos v\, du$.

12. Let $C$ be a straight line (say a piece of the $z$-axis) parametrized according
to arc length and let $e_2$ be rotating at a rate $f(s)$ about $C$ (so, for example,
$e_2 = \cos f(s)\, \mathbf{i} + \sin f(s)\, \mathbf{j}$ where $\mathbf{i}$ and $\mathbf{j}$ are the unit vectors in the $x$ and $y$
directions). What is $i^*\Theta_{12}$?

To continue our understanding of $\Theta_{12}$, let us consider what it means for two
ribbons, $i : I \to H$ and $j : I \to H$, to have the same value of the pullback of $\Theta_{12}$
at some point $s_0 \in I$ (where $I$ is some interval on the real line). So

$$(i^*\Theta_{12})|_{s=s_0} = (j^*\Theta_{12})|_{s=s_0}.$$

There is a (unique) left multiplication, that is a unique Euclidean motion, which
carries $i(s_0)$ to $j(s_0)$. Let us assume that we have applied this motion, so we assume
that $i(s_0) = j(s_0)$. Let us write

$$i(s) = (C(s), e_1(s), e_2(s), e_3(s)), \qquad j(s) = (D(s), f_1(s), f_2(s), f_3(s))$$

and we are assuming that $C(s_0) = D(s_0)$, $C'(s_0) = e_1(s_0) = f_1(s_0) = D'(s_0)$,
so the curves $C$ and $D$ are tangent at $s_0$, and that $e_2(s_0) = f_2(s_0)$ so that the
planes of the ribbon (spanned by the first two orthonormal vectors) coincide.
Then our condition about the equality of the pullbacks of $\Theta_{12}$ asserts that

$$((e_2' - f_2')(s_0), e_1(s_0)) = 0$$

and of course $((e_2' - f_2')(s_0), e_2(s_0)) = 0$ automatically since $e_2(s)$ and $f_2(s)$
are unit vectors. So the condition is that the relative change of $e_2$ and $f_2$ (and
similarly $e_1$ and $f_1$) at $s_0$ be normal to the common tangent plane to the ribbon.

2.22 Developing a ribbon. 

We will now drop one dimension, and consider ribbons in the plane (or, if you
like, ribbons lying in a fixed plane in three dimensional space). So all we have
is $\theta$ and $\Theta_{12}$. Also, the orientation of the curve and of the plane completely
determines $e_2$ as the unit vector in the plane perpendicular to the curve and
such that $e_1, e_2$ give the correct orientation, so a ribbon in the plane is the same
as an oriented curve.

13. Let $k = k(s)$ be any continuous function of $s$. Show that there is a ribbon
in the plane whose base curve is parametrized by arc length and for which
$j^*\Theta_{12} = k\, ds$. Furthermore, show that this planar ribbon (curve) is uniquely
determined up to a planar Euclidean motion.

It follows from the preceding exercise that we have a way of associating
a curve in the plane (determined up to a planar Euclidean motion) to any
ribbon in space. It consists of rocking and rolling the ribbon along the plane in
such a way that the infinitesimal changes in $e_1$ and $e_2$ are always normal to the
plane. Mathematically, it consists in solving problem 13 for the $k = k(s)$ where
$i^*\Theta_{12} = k\, ds$ for the ribbon. We call this operation developing the ribbon onto a
plane. In particular, if we have a curve on a surface, we can consider the ribbon
along the curve induced by the surface. In this way, we may talk of developing
the surface on a plane along the given curve. Intuitively, if the surface were
convex, this amounts to rolling the surface on a plane along the curve.
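
Developing a ribbon can be carried out numerically once the function $k(s)$ is known. The sketch below integrates the planar system $\phi' = k(s)$, $x' = \cos\phi$, $y' = \sin\phi$ with a midpoint rule; the angle $\phi$ of the tangent with the $x$-axis, the starting point at the origin, the initial direction along the $x$-axis and the name `develop` are all choices made for this illustration (they fix the planar Euclidean motion that is otherwise left free).

```python
import math

def develop(k, length, steps=2000):
    """Planar curve with Theta_12 = k(s) ds, parametrized by arc length.

    Integrates phi' = k(s), x' = cos(phi), y' = sin(phi) with the midpoint
    rule, starting at the origin with tangent direction along the x-axis.
    Returns the list of points (x, y) of the developed curve.
    """
    h = length / steps
    x = y = phi = 0.0
    points = [(x, y)]
    for n in range(steps):
        s = n * h
        phi_mid = phi + 0.5 * h * k(s + 0.25 * h)   # direction at mid-step
        x += h * math.cos(phi_mid)
        y += h * math.sin(phi_mid)
        phi += h * k(s + 0.5 * h)
        points.append((x, y))
    return points

# Constant k = 1/r over a length 2*pi*r should develop to a circle of radius r.
r = 2.0
circle = develop(lambda s: 1.0 / r, 2 * math.pi * r)
end_x, end_y = circle[-1]
max_radius_error = max(abs(math.hypot(px, py - r) - r) for px, py in circle)
print(end_x, end_y, max_radius_error)
```

With constant $k = 1/r$ the developed curve closes up after length $2\pi r$ and stays (up to discretization error) on the circle of radius $r$ centered at $(0, r)$.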

14. What are the results of developing the ribbons of Problem 12 and
the ribbon we associated to a circle of latitude on the sphere?

2.23 Parallel transport along a ribbon. 

Recall that a ribbon is a curve in the space, $H$, of all Euclidean frames, having
the property that the base point, that is the $C$ of the frame $(C, e_1, e_2, e_3)$, has
non-vanishing derivative at all points. So $C$ defines a curve in Euclidean three
space with nowhere vanishing tangent. We will parameterize this curve (and
the ribbon) by arc length. By a unit vector field tangent to the ribbon we will
mean a curve, $v(s)$, of unit vectors everywhere tangent to the ribbon, so

$$v(s) = \cos\alpha(s)\, e_1(s) + \sin\alpha(s)\, e_2(s). \tag{2.34}$$

We say that the vector field is parallel along the ribbon if the infinitesimal change
in $v$ is always normal to the ribbon, i.e. if

$$(v'(s), e_1(s)) = (v'(s), e_2(s)) = 0.$$

Recall the form $\Theta_{12} = k\, ds$ from before.

15. Show that the vector field as given above is parallel if and only if the
function $\alpha$ satisfies the differential equation

$$\alpha' + k = 0.$$

Conclude that the notion of parallelism depends only on the form $\Theta_{12}$. Also
conclude that given any unit vector, $v_0$, tangent to the ribbon at some point $s_0$, there is a unique
parallel vector field taking on the value $v_0$ at $s_0$. The value $v(s_1)$ at some
second point is called the parallel transport of $v_0$ (along the ribbon) from $s_0$ to
$s_1$.

16. What is the condition on a ribbon that the tangent vector to the curve
itself, i.e. the vector field $e_1$, be parallel? Which circles on the sphere are such
that the associated ribbon has this property?

Suppose the ribbon is closed, i.e. $C(s + L) = C(s)$, $e_1(s + L) = e_1(s)$,
$e_2(s + L) = e_2(s)$ for some length $L$. We can then start with a vector $v_0$ at the point $s_0$
and transport it all the way around the ribbon until we get back to the same
point, i.e. transport from $s_0$ to $s_0 + L$. The vector $v_1$ we so obtain will make
some angle, call it $\Phi$, with the vector $v_0$. The angle $\Phi$ is called the holonomy of
the (parallel transport of the) ribbon.

17. Show that $\Phi$ is independent of the choice of $s_0$ and $v_0$. What is its expression
in terms of $\Theta_{12}$?

18. What is the holonomy for a circle on the sphere in terms of its latitude?

19. Show that if the ribbon is planar (so $e_1$ and $e_2$ lie in a fixed plane) a
vector field is parallel if and only if it is parallel in the usual sense of Euclidean
geometry (say makes a constant angle with the $x$-axis). But remember that the
curve is turning. So the holonomy of a circle in the plane is $\pm 2\pi$ depending on
the orientation. Similarly for the sum of the exterior angles of a triangle (think
of the corners as being rounded out).

Convince yourself of the following fact, which is not so easy unless you know
the trick: for any smooth simple closed curve (i.e. one with no self
intersections) in the plane the holonomy is always $\pm 2\pi$.
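
For a circle in the plane the statements above can be checked directly. The sketch below takes a counterclockwise circle of radius $r$ parametrized by arc length, for which $\Theta_{12} = (e_1', e_2)\,ds = ds/r$, uses $\alpha(s) = \alpha_0 - s/r$ as the solution of $\alpha' + k = 0$, and samples the vector $v(s)$ of (2.34). The radius, the initial angle `alpha0` and the function names are arbitrary illustrative choices.

```python
import math

r = 1.5                  # radius of the circle (arbitrary test value)
L = 2 * math.pi * r      # length of the closed ribbon
k = 1.0 / r              # Theta_12 = k ds for the counterclockwise circle

def e1(s):
    """Unit tangent of the counterclockwise circle, by arc length."""
    return (-math.sin(s / r), math.cos(s / r))

def e2(s):
    """Unit normal obtained by rotating e1 through +90 degrees."""
    return (-math.cos(s / r), -math.sin(s / r))

def alpha(s, alpha0=0.4):
    """Solution of alpha' + k = 0 with alpha(0) = alpha0."""
    return alpha0 - k * s

def v(s):
    """The vector cos(alpha) e1 + sin(alpha) e2 of (2.34)."""
    a, t, n = alpha(s), e1(s), e2(s)
    return (math.cos(a) * t[0] + math.sin(a) * n[0],
            math.cos(a) * t[1] + math.sin(a) * n[1])

samples = [v(L * j / 8) for j in range(9)]   # v at nine points around the circle
holonomy = alpha(L) - alpha(0.0)             # change of alpha once around
print(samples[0], samples[-1], holonomy)
```

All sampled vectors agree, so the field is constant in the Euclidean sense, while the angle $\alpha$ measured against the turning tangent changes by $-2\pi$; reversing the orientation of the circle changes the sign.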

Exercises 15, 17, and 19, together with the results above, give an alternative
interpretation of parallel transport: develop the ribbon onto the plane and then
just translate the vector $v_0$ in the Euclidean plane so that its origin lies at the
image of $s_1$. Then consider the corresponding vector field along the ribbon.

The function $k$ in $\Theta_{12} = k\, ds$ is called the geodesic curvature of the ribbon.
The integral $\int \Theta_{12} = \int k\, ds$ is called the total geodesic curvature of the ribbon. It
gives the total change in angle (including multiples of $2\pi$) between the tangents
to the initial and final points of the developed curve.

2.24 Surfaces in $\mathbb{R}^3$.

We let $M$ be a two dimensional submanifold of $\mathbb{R}^3$ and $O$ its bundle of adapted
frames. We have a "projection" map

$$\pi : O \to M, \qquad (m, e_1, e_2, e_3) \mapsto m,$$

which we can also write

$$\pi = m.$$

Suppose that we consider the "truncated" version of the adapted bundle of
frames $O$ where we forget about $e_3$. That is, consider the set of all $(m, e_1, e_2)$ where
$m \in M$ and $e_1, e_2$ is an orthonormal basis of the tangent space $TM_m$ to $M$ at
$m$. Notice that the definition we just gave was intrinsic. The concept of an
orthonormal basis of $TM_m$ depends only on the scalar product on $TM_m$. The
differential of the map $m : O \to M$ at a point $(m, e_1, e_2)$ sends a tangent vector
$\xi$ to $O$ at $(m, e_1, e_2, e_3)$ to a tangent vector to $M$ at $m$, and the scalar product
of this image vector with $e_1$ is a linear function of $\xi$. We have just given an
intrinsic definition of $\theta_1$. (By abuse of language I am using this same letter $\theta_1$ for the form
$(dm, e_1)$ on $O$ as $e_3$ does not enter into its definition.) Similarly, we see that $\theta_2$
is an intrinsically defined form. From their very definitions, the forms $\theta_1$ and $\theta_2$
are linearly independent at every point of $O$. Therefore the forms $d\theta_1$ and $d\theta_2$
are intrinsic, and this proves that the form $\Theta_{12}$ is intrinsic. Indeed, if we had
two linear differential forms $\sigma$ and $\tau$ on $O$ which satisfied

$$d\theta_1 = \sigma \wedge \theta_2,$$
$$d\theta_1 = \tau \wedge \theta_2,$$
$$d\theta_2 = -\sigma \wedge \theta_1,$$
$$d\theta_2 = -\tau \wedge \theta_1$$

then the first two equations give

$$(\sigma - \tau) \wedge \theta_2 = 0$$

which implies that $(\sigma - \tau)$ is a multiple of $\theta_2$, and the last two equations imply
that $\sigma - \tau$ is a multiple of $\theta_1$, so $\sigma = \tau$.

The next few problems will give a (third)
proof of Gauss's theorema egregium. They will show that

$$d\Theta_{12} = -\pi^*(K)\, \theta_1 \wedge \theta_2$$

where $K$ is the Gaussian curvature.

This assertion is local (in $M$), so we may temporarily make the assumption
that $M$ is orientable - this allows us to look at the sub-bundle of $O$ consisting of oriented
frames, that is, those frames for which $e_1, e_2$ form an oriented basis of $TM_m$
and where $e_1, e_2, e_3$ form an oriented frame on $\mathbb{R}^3$.

Let $dA$ denote the (oriented) area form on the surface $M$. (A bad but
standard notation, since the area form is not the differential of a one form,
in general.) Recall that when evaluated on any pair of tangent vectors, $\eta_1, \eta_2$, at
$m \in M$ it is the oriented area of the parallelogram spanned by $\eta_1$ and $\eta_2$, and
this is just the determinant of the matrix of scalar products of the $\eta$'s with any
oriented orthonormal basis.

20. Explain why

$$\pi^* dA = \theta_1 \wedge \theta_2.$$

The third component, $e_3$, of any frame is completely determined by the point
on the surface and the orientation as the unit normal, $n$, to the surface. Now $n$
can be thought of as a map from $M$ to the unit sphere, $S$, in $\mathbb{R}^3$. Let $dS$ denote
the oriented area form of the unit sphere. So $n^* dS$ is a two form on $M$ and we
can define the function $K$ by

$$n^* dS = K\, dA.$$

21. Show that the function $K$ is the Gaussian curvature of the surface.

22. Show that

$$n^* dS = \Theta_{31} \wedge \Theta_{32}.$$

23. Conclude that

$$d\Theta_{12} = -\pi^*(K\, dA).$$

We are going to want to apply Stokes' theorem to this formula. But in order
to do so, we need to integrate over a two dimensional region. So let $U$ be some
open subset of $M$ and let

$$\psi : U \to \pi^{-1}U \subset O$$

be a map satisfying

$$\pi \circ \psi = \mathrm{id}.$$

So $\psi$ assigns a frame to each point of $U$ in a differentiable manner. Let $C$ be a
curve on $M$ and suppose that $C$ lies in $U$. Then the surface determines a ribbon
along this curve, namely the choice of frames for which $e_1$ is tangent to the
curve (and pointing in the positive direction). So we have a map $R : C \to O$
coming from the geometry of the surface, and (with now necessarily different
notation from the preceding section) $R^*\Theta_{12} = k\, ds$, where $k$ is the geodesic curvature of
the ribbon as studied above. Since the ribbon is determined by the curve (as $M$
is fixed) we can call it the geodesic curvature of the curve. On the other hand,
we can consider the form $\psi^*\Theta_{12}$ pulled back to the curve. Let

$$\psi \circ C(s) = (C(s), f_1(s), f_2(s), n(s))$$

and let $\phi(s)$ be the angle that $e_1(s)$ makes with $f_1(s)$, so

$$e_1(s) = \cos\phi(s)\, f_1(s) + \sin\phi(s)\, f_2(s), \qquad e_2(s) = -\sin\phi(s)\, f_1(s) + \cos\phi(s)\, f_2(s).$$

24. Let $C^*\psi^*\Theta_{12}$ denote the pullback of $\psi^*\Theta_{12}$ to the curve. Show that

$$k\, ds = d\phi + C^*\psi^*\Theta_{12}.$$

Conclude that

Proposition 2 The total geodesic curvature $= \phi(b) - \phi(a) + \int_C \psi^*\Theta_{12}$, where $\phi(b) - \phi(a)$ denotes
the total change of angle around the curve.

How can we construct a $\psi$? Here is one way that we described earlier:
Suppose that $U$ is a coordinate chart and that $x_1, x_2$ are coordinates on this
chart. Then $\partial/\partial x_1, \partial/\partial x_2$ are linearly independent vectors at each point and we
can apply Gram-Schmidt to orthonormalize them. This gives a $\psi$, and the angle
$\phi$ above is just the angle that the vector $e_1$ makes with the $x_1$-axis in this
coordinate system. Suppose we take $C$ to be the boundary of some nice region,
$D$, in $U$. For example, suppose that $C$ is a triangle or some other polygon with
its edges rounded to make a smooth curve. Then the total change in angle is
$2\pi$.

25. Conclude that for such a curve

$$\iint_D K\, dA + \int_C k\, ds = 2\pi.$$



The integral of KdA is called the total Gaussian curvature. 

26. Show that as the curve actually approaches the polygon, the contribution
from the rounded corners approaches the exterior angle of the polygon. Conclude
that if a region in a coordinate neighborhood on the surface is bounded
by continuous piecewise differentiable arcs making exterior angles at the corners, then

Proposition 3 the total Gaussian curvature $+ \sum$ total geodesic curvatures $+ \sum$
exterior angles $= 2\pi$.

27. Suppose that we have subdivided a compact surface into polygonal regions,
each contained in a coordinate neighborhood, with $f$ faces, $e$ edges, and $v$
vertices. Let $\chi = f - e + v$. Show that

$$\int_M K\, dA = 2\pi\chi.$$
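
Proposition 3 and the formula of exercise 27 can be tried out on the unit sphere, where $K = 1$ and the total area is $4\pi$. The sketch below looks at one octant ($x, y, z \ge 0$), a triangle bounded by three arcs of great circles meeting at right angles, and then at the subdivision of the whole sphere into its eight octants. That great circle arcs contribute no geodesic curvature is used here as a known fact, not derived.

```python
import math

# Unit sphere: Gaussian curvature K = 1, total area 4*pi.
K = 1.0
sphere_area = 4 * math.pi

# One octant: a triangle bounded by three arcs of great circles.
octant_total_curvature = K * sphere_area / 8
geodesic_curvature_terms = 0.0                  # great circle arcs (assumed known)
exterior_angles = 3 * (math.pi - math.pi / 2)   # each interior angle is pi/2

prop3_sum = octant_total_curvature + geodesic_curvature_terms + exterior_angles

# The eight octants subdivide the sphere like an octahedron.
f, e, v = 8, 12, 6
chi = f - e + v
total_curvature = K * sphere_area
print(prop3_sum, 2 * math.pi, chi, total_curvature, 2 * math.pi * chi)
```

The octant gives $\pi/2 + 0 + 3\pi/2 = 2\pi$, and the octahedral subdivision gives $\chi = 2$ with $\int_M K\,dA = 4\pi = 2\pi\chi$.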



Chapter 3 

Levi-Civita Connections. 



3.1 Definition of a linear connection on the tangent bundle.

A linear connection $\nabla$ on a manifold $M$ is a rule which assigns a vector field
$\nabla_X Y$ to each pair of vector fields $X$ and $Y$ which is bilinear (over $\mathbb{R}$) subject
to the rules

$$\nabla_{fX} Y = f\,\nabla_X Y \tag{3.1}$$

and

$$\nabla_X (gY) = (Xg)\,Y + g\,(\nabla_X Y). \tag{3.2}$$

While condition (3.2) is the same as the corresponding condition

$$L_X(gY) = [X, gY] = (Xg)\,Y + g\,L_X Y$$

for Lie derivatives, condition (3.1) is quite different from the corresponding
formula

$$L_{fX} Y = [fX, Y] = -(Yf)\,X + f\,L_X Y$$

for Lie derivatives. In contrast to the Lie derivative, condition (3.1) implies that
the value of $\nabla_X Y$ at $x \in M$ depends only on the value $X(x)$.

If $\xi \in TM_x$ is a tangent vector at $x \in M$, and $Y$ is a vector field defined in
some neighborhood of $x$ we use the notation

$$\nabla_\xi Y := (\nabla_X Y)(x), \quad \text{where } X(x) = \xi. \tag{3.3}$$

By the preceding comments, this does not depend on how we choose to extend
$\xi$ to $X$ so long as $X(x) = \xi$.

While the Lie derivative is an intrinsic notion depending only on the differentiable
structure, a connection is an additional piece of geometric structure.

3.2 Christoffel symbols.

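These give the expression of a connection in local coordinates: Let $x^1, \dots, x^n$
be a coordinate system, and let us write

$$\partial_i := \frac{\partial}{\partial x^i}$$

for the corresponding vector fields. Then

$$\nabla_{\partial_i} \partial_j = \sum_k \Gamma^k_{ij}\, \partial_k$$

where the functions $\Gamma^k_{ij}$ are called the Christoffel symbols. We will frequently
use the shortened notation

$$\nabla_i := \nabla_{\partial_i}.$$

So the definition of the Christoffel symbols is written as

$$\nabla_i \partial_j = \sum_k \Gamma^k_{ij}\, \partial_k. \tag{3.4}$$

If

$$Y = \sum_j Y^j \partial_j$$

is the local expression of a general vector field $Y$ then (3.2) implies that

$$\nabla_i Y = \sum_k \left( \frac{\partial Y^k}{\partial x^i} + \sum_j \Gamma^k_{ij}\, Y^j \right) \partial_k. \tag{3.5}$$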
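
The local formula for $\nabla_i Y$ follows from (3.2) and (3.4) in two steps, which may be worth writing out once. With $Y = \sum_j Y^j \partial_j$ and $\partial_i Y^j = \partial Y^j / \partial x^i$:

```latex
\begin{aligned}
\nabla_i Y &= \sum_j \nabla_i \bigl( Y^j \partial_j \bigr)
  = \sum_j \Bigl( \frac{\partial Y^j}{\partial x^i}\, \partial_j + Y^j\, \nabla_i \partial_j \Bigr)
  && \text{by (3.2)} \\
 &= \sum_k \frac{\partial Y^k}{\partial x^i}\, \partial_k
    + \sum_{j,k} \Gamma^k_{ij}\, Y^j\, \partial_k
  && \text{by (3.4)} \\
 &= \sum_k \Bigl( \frac{\partial Y^k}{\partial x^i} + \sum_j \Gamma^k_{ij}\, Y^j \Bigr) \partial_k .
\end{aligned}
```

In the first sum the index $j$ has simply been renamed $k$ so that both terms are expanded in the same basis vectors.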



3.3 Parallel transport. 

Let $C : I \to M$ be a smooth map of an interval $I$ into $M$. We refer to $C$ as
a parameterized curve. We will say that this curve is non-singular if $C'(t) \neq 0$
for any $t$, where $C'(t)$ denotes the tangent vector at $t \in I$. By a vector field
$Z$ along $C$ we mean a rule which smoothly attaches to each $t \in I$ a tangent
vector $Z(t)$ to $M$ at $C(t)$. We will let $\mathcal{V}(C)$ denote the set of all smooth vector
fields along $C$. For example, if $V$ is a vector field on $M$, then the restriction of
$V$ to $C$, i.e. the rule

$$V_C(t) := V(C(t))$$

is a vector field along $C$. Since the curve $C$ might cross itself, or be closed, it is
clear that not every vector field along $C$ is the restriction of a vector field on $M$.

On the other hand, if $C$ is non-singular, then the implicit function theorem
says that for any $t_0 \in I$ we can find an interval $J$ containing $t_0$ and a system of
coordinates about $C(t_0)$ in $M$ such that in terms of these coordinates the curve
is given by

$$x^1(t) = t, \qquad x^i(t) = 0, \quad i > 1$$

for $t \in J$. If $Z$ is a smooth vector field along $C$ then for $t \in J$ we may write

$$Z(t) = \sum_j Z^j(t)\, \partial_j(t, 0, \dots, 0).$$

We may then define the vector field $Y$ on this coordinate neighborhood by

$$Y(x^1, \dots, x^n) = \sum_j Z^j(x^1)\, \partial_j$$

and it is clear that $Z$ is the restriction of $Y$ to $C$ on $J$. In other words, locally,
every vector field along a non-singular curve is the restriction of a vector field
of $M$. If $Z = Y_C$ is the restriction of a vector field $Y$ to $C$ we can define its
"derivative" $Z'$, also a vector field along $C$, by

$$Y_C'(t) := \nabla_{C'(t)} Y. \tag{3.6}$$

If $g$ is a smooth function defined in a neighborhood of the image of $C$, and $h$ is
the pull back of $g$ to $I$ via $C$, so

$$h(t) = g(C(t))$$

then the chain rule says that

$$h'(t) = \frac{d}{dt}\, g(C(t)) = C'(t)\,g,$$

the derivative of $g$ with respect to the tangent vector $C'(t)$. Then if

$$Z = Y_C$$

for some vector field $Y$ on $M$ (and $h = g(C(t))$) equation (3.2) implies that

$$(hZ)' = h'Z + hZ'. \tag{3.7}$$

We claim that there is a unique linear map $Z \mapsto Z'$ defined on all of $\mathcal{V}(C)$ such
that (3.7) and (3.6) hold. Indeed, to prove uniqueness, it is enough to prove
uniqueness in a coordinate neighborhood, where

$$Z(t) = \sum_j Z^j(t)\, (\partial_j)_C.$$

Equations (3.7) and (3.6) then imply that

$$Z'(t) = \sum_j \left( {Z^j}'(t)\, (\partial_j)_C + Z^j(t)\, \nabla_{C'(t)} \partial_j \right). \tag{3.8}$$

In other words, any notion of "derivative along $C$" satisfying (3.7) and (3.6) must
be given by (3.8) in any coordinate system. This proves the uniqueness. On the
other hand, it is immediate to check that (3.8) satisfies (3.7) and (3.6) if the
curve lies entirely in a coordinate neighborhood. But the uniqueness implies
that on the overlap of two neighborhoods the two formulas corresponding to
(3.8) must coincide, proving the global existence.

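We can make formula (3.8) even more explicit in local coordinates using the
Christoffel symbols, which tell us that

$$\nabla_{C'(t)} \partial_j = \sum_k \sum_i \Gamma^k_{ij}\, \frac{d(x^i \circ C)}{dt}\, \partial_k.$$

Substituting into (3.8) gives

$$Z' = \sum_k \left( \frac{dZ^k}{dt} + \sum_{ij} \Gamma^k_{ij}\, \frac{d(x^i \circ C)}{dt}\, Z^j \right) \partial_k. \tag{3.9}$$

A vector field $Z$ along $C$ is said to be parallel if

$$Z'(t) = 0.$$

Locally this amounts to the $Z^i$ satisfying the system of linear differential equations

$$\frac{dZ^k}{dt} + \sum_{ij} \Gamma^k_{ij}\, \frac{d(x^i \circ C)}{dt}\, Z^j = 0. \tag{3.10}$$

Hence the existence and uniqueness theorem for linear homogeneous differential
equations (in particular existence over the entire interval of definition) implies
that

Proposition 4 For any $\zeta \in TM_{C(0)}$ there is a unique parallel vector field $Z$
along $C$ with $Z(0) = \zeta$.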
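
The linear system for a parallel vector field can be integrated numerically once Christoffel symbols are given. As an illustration, the sketch below uses the flat plane in polar coordinates $(r, \varphi)$, whose nonzero symbols are $\Gamma^r_{\varphi\varphi} = -r$ and $\Gamma^\varphi_{r\varphi} = \Gamma^\varphi_{\varphi r} = 1/r$; this choice of connection, the unit circle as the curve, the Runge-Kutta integrator and all function names are assumptions of the example, not something fixed by the text.

```python
import math

def christoffel(k, i, j, x):
    """Gamma^k_{ij} of the flat plane in polar coordinates x = (r, phi).

    Index 0 is r and index 1 is phi (assumed example connection).
    """
    r = x[0]
    if (k, i, j) == (0, 1, 1):
        return -r
    if (k, i, j) in ((1, 0, 1), (1, 1, 0)):
        return 1.0 / r
    return 0.0

def curve(t):
    """The unit circle r = 1, phi = t, together with its velocity."""
    return (1.0, t), (0.0, 1.0)

def rhs(t, Z):
    """dZ^k/dt = - sum_{i,j} Gamma^k_{ij} (dx^i/dt) Z^j."""
    x, dx = curve(t)
    return [-sum(christoffel(k, i, j, x) * dx[i] * Z[j]
                 for i in range(2) for j in range(2)) for k in range(2)]

def transport(Z0, t_end, steps=2000):
    """Classical fourth-order Runge-Kutta integration of the system above."""
    h = t_end / steps
    t, Z = 0.0, list(Z0)
    for _ in range(steps):
        k1 = rhs(t, Z)
        k2 = rhs(t + h / 2, [z + h / 2 * a for z, a in zip(Z, k1)])
        k3 = rhs(t + h / 2, [z + h / 2 * a for z, a in zip(Z, k2)])
        k4 = rhs(t + h, [z + h * a for z, a in zip(Z, k3)])
        Z = [z + h / 6 * (a + 2 * b + 2 * c + d)
             for z, a, b, c, d in zip(Z, k1, k2, k3, k4)]
        t += h
    return Z

def to_cartesian(x, Z):
    """Components of Z^r d_r + Z^phi d_phi in the Cartesian basis."""
    r, phi = x
    return (Z[0] * math.cos(phi) - Z[1] * r * math.sin(phi),
            Z[0] * math.sin(phi) + Z[1] * r * math.cos(phi))

t_end = 2.0
Z_end = transport([1.0, 0.0], t_end)
start_vec = to_cartesian((1.0, 0.0), [1.0, 0.0])
end_vec = to_cartesian((1.0, t_end), Z_end)
print(Z_end, start_vec, end_vec)
```

The polar components of $Z$ change along the circle, but converted back to Cartesian components the transported vector is the same at both ends, as one expects for the flat plane.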

The rule $t \mapsto C'(t)$ is a vector field along $C$ and hence we can compute its
derivative, which we denote by $C''$ and call the acceleration of $C$. Whereas
the notion of tangent vector, $C'$, makes sense on any manifold, the acceleration
only makes sense when we are given a connection.



3.4 Geodesics.

A curve with acceleration zero is called a geodesic. In local coordinates we
substitute $Z^k = {x^k}'$ into (3.10) to obtain the equation for geodesics in local
coordinates:

$$\frac{d^2 x^k}{dt^2} + \sum_{ij} \Gamma^k_{ij}\, \frac{dx^i}{dt}\, \frac{dx^j}{dt} = 0 \tag{3.11}$$

where we have written $x^k$ instead of $x^k \circ C$ in (3.11) to unburden the notation.
The existence and uniqueness theorem for ordinary differential equations implies
that

Proposition 5 For any tangent vector $\xi$ at any point $x \in M$ there is an interval
$I$ about $0$ and a unique geodesic $C$ such that $C(0) = x$ and $C'(0) = \xi$.

By the usual arguments we can then extend the domain of definition of the
geodesic through $\xi$ to be maximal.

This is the first of many definitions (or characterizations, if we take this to be
the basic definition) that we shall have of geodesics - the notion of being
self-parallel. (In the case that all the $\Gamma^k_{ij} = 0$ we get the equations for straight
lines.)
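
The geodesic equations give straight lines even when the Christoffel symbols do not vanish, provided the connection is the flat one written in curvilinear coordinates. The sketch below integrates the geodesic equations for the flat plane in polar coordinates $(r, \varphi)$, with $\Gamma^r_{\varphi\varphi} = -r$ and $\Gamma^\varphi_{r\varphi} = \Gamma^\varphi_{\varphi r} = 1/r$; this choice of connection, the initial data and the function names are assumptions made for the illustration.

```python
import math

def christoffel(k, i, j, x):
    """Gamma^k_{ij} of the flat plane in polar coordinates x = (r, phi).

    Index 0 is r and index 1 is phi (assumed example connection).
    """
    r = x[0]
    if (k, i, j) == (0, 1, 1):
        return -r
    if (k, i, j) in ((1, 0, 1), (1, 1, 0)):
        return 1.0 / r
    return 0.0

def rhs(state):
    """First-order form of x^k'' = - sum Gamma^k_{ij} x^i' x^j'."""
    x, dx = state[:2], state[2:]
    acc = [-sum(christoffel(k, i, j, x) * dx[i] * dx[j]
                for i in range(2) for j in range(2)) for k in range(2)]
    return list(dx) + acc

def geodesic(x0, dx0, t_end, steps=2000):
    """Runge-Kutta integration of the geodesic with C(0) = x0, C'(0) = dx0."""
    h = t_end / steps
    y = list(x0) + list(dx0)
    for _ in range(steps):
        k1 = rhs(y)
        k2 = rhs([a + h / 2 * b for a, b in zip(y, k1)])
        k3 = rhs([a + h / 2 * b for a, b in zip(y, k2)])
        k4 = rhs([a + h * b for a, b in zip(y, k3)])
        y = [a + h / 6 * (p + 2 * q + 2 * s + w)
             for a, p, q, s, w in zip(y, k1, k2, k3, k4)]
    return y

# Start at r = 1, phi = 0 with velocity pointing in the phi direction:
# in Cartesian terms, the point (1, 0) with velocity (0, 1).
r, phi, _, _ = geodesic((1.0, 0.0), (0.0, 1.0), t_end=1.0)
x_end, y_end = r * math.cos(phi), r * math.sin(phi)
print(x_end, y_end)
```

In Cartesian terms the computed curve is the vertical line through $(1, 0)$ traversed at unit speed, so after time $1$ it should be at $(1, 1)$ up to integration error.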

Suppose that $C : I \to M$ is a (non-constant) geodesic, and we consider a
"reparametrization" of $C$, i.e. consider the curve $B = C \circ h : J \to M$ where
$h : J \to I$ is a diffeomorphism of the interval $J$ onto the interval $I$. We write
$t = h(s)$ so that

$$\frac{dB}{ds} = \frac{dC}{dt}\, \frac{dh}{ds}$$

and hence

$$\frac{d^2 B}{ds^2} = \frac{d^2 C}{dt^2} \left( \frac{dh}{ds} \right)^2 + \frac{dC}{dt}\, \frac{d^2 h}{ds^2} = \frac{dC}{dt}\, \frac{d^2 h}{ds^2}$$

since $C'' = 0$ as $C$ is a geodesic. The fact that $C$ is not constant (and the
uniqueness theorem for differential equations) says that $C'$ is never zero. Hence
$B$ is a geodesic if and only if

$$\frac{d^2 h}{ds^2} = 0$$

or

$$h(s) = as + b$$

where $a$ and $b$ are constants with $a \neq 0$. In short, the fact of being a non-constant
geodesic determines the parameterization up to an affine change of
parameter.



3.5 Covariant differential. 

We can extend the notion of covariant derivative with respect to a vector field
$X$ (which has been defined on functions by $f \mapsto Xf$ and on vector fields by
$Y \mapsto \nabla_X Y$) to all tensor fields. We first extend to linear differential forms by
the rule

$$(\nabla_X\theta)(Y) = X(\theta(Y)) - \theta(\nabla_X Y). \qquad (3.12)$$

Replacing $Y$ by $gY$ has the effect of pulling out a factor of $g$, since the two
terms on the right involving $Xg$ cancel. This shows that $\nabla_X\theta$ is again a linear
differential form. Notice that

$$\nabla_{fX}\theta = f\,\nabla_X\theta$$

and

$$\nabla_X(g\theta) = (Xg)\theta + g\,\nabla_X\theta.$$

We now extend $\nabla_X$ to be a "tensor derivation" by requiring that

$$\nabla_X(\alpha\otimes\beta) = (\nabla_X\alpha)\otimes\beta + \alpha\otimes\nabla_X\beta$$

for any pair of tensor fields $\alpha$ and $\beta$. For example

$$\nabla_X(\theta\otimes Z) = \nabla_X\theta\otimes Z + \theta\otimes\nabla_X Z.$$

This then defines $\nabla_X$ on all tensor fields which are sums of products of one
forms and vector fields. Notice that if we define the "contraction"

$$C : \theta\otimes Z \mapsto \theta(Z)$$

then the definition (3.12) of $\nabla_X\theta$ implies that

$$\nabla_X(C(\theta\otimes Z)) = \nabla_X(\theta(Z)) = C(\nabla_X\theta\otimes Z + \theta\otimes\nabla_X Z) = C(\nabla_X(\theta\otimes Z)).$$

In other words, $\nabla_X$ commutes with contraction:

$$\nabla_X\circ C = C\circ\nabla_X. \qquad (3.13)$$

This was checked in the special case that we had a tensor of type (1,1) which
was the tensor product of a one form and a vector field. But if we have a tensor
of type (r,s) which is a product of one forms and vector fields, then we may form
the contraction of any one-form factor with any vector field factor to obtain a
tensor of type (r-1,s-1), and (3.13) continues to hold.

If $\gamma$ is a general tensor field of type (r,s), it is completely determined by
evaluation on all tensor fields $\rho$ of type (s,r) which are products of one forms
and vector fields. We then define $\nabla_X\gamma$ by

$$(\nabla_X\gamma)(\rho) = X(\gamma(\rho)) - \gamma(\nabla_X\rho).$$

In the case that $\gamma$ is itself a sum of products of one-forms and vector fields
this coincides with our old definition. Again this implies that $\nabla_X\gamma$ is a tensor.
Furthermore, contraction in any two positions in $\gamma$ is dual (locally) to insertion
of $\sum_i \theta^i\otimes E_i$ into the corresponding positions in a tensor of type (s-1,r-1), where
the $E_i$ form a basis locally of the vector fields at each point and the $\theta^i$ form the
dual basis. But

$$\nabla_X\Big(\sum_i \theta^i\otimes E_i\Big) = 0$$

since if the functions $a^i_j$ are defined by $\nabla_X E_j = \sum_i a^i_j E_i$ then $\nabla_X\theta^i = -\sum_j a^i_j\theta^j$,
as follows from (3.12). This shows that (3.13) holds in general.

We can think of the covariant derivative as assigning to each tensor field $\gamma$
of type (r,s) a tensor field $\nabla\gamma$ of type (r,s+1), given by

$$\nabla\gamma(\rho\otimes X) = (\nabla_X\gamma)(\rho).$$

The tensor $\nabla\gamma$ is called the covariant differential of $\gamma$.
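
As a small worked example of rule (3.12), one can compute the covariant derivative of a one form in local coordinates. This is only a sketch under assumptions: it uses the Christoffel symbols in the form $\nabla_{\partial_i}\partial_j = \sum_k \Gamma^k_{ij}\partial_k$, and the component notation $\theta = \sum_j \theta_j\, dx^j$ is introduced here for illustration.

```latex
(\nabla_{\partial_i}\theta)(\partial_j)
  = \partial_i\bigl(\theta(\partial_j)\bigr) - \theta(\nabla_{\partial_i}\partial_j)
  = \frac{\partial \theta_j}{\partial x^i} - \sum_k \Gamma^k_{ij}\,\theta_k ,
\qquad\text{so in particular}\qquad
\nabla_{\partial_i}\,dx^k = -\sum_j \Gamma^k_{ij}\,dx^j .
```

The second formula is the coordinate version of the relation between $\nabla_X E_j$ and $\nabla_X\theta^i$ used in the argument that $\nabla_X$ commutes with contraction.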
3.6 Torsion.

Let $\nabla$ be a connection, $X$ and $Y$ vector fields and $f$ and $g$ functions. Using
(3.1), (3.2), and the corresponding equations for Lie brackets we find

$$\nabla_{fX}(gY) - \nabla_{gY}(fX) - [fX, gY] = fg\,\big(\nabla_X Y - \nabla_Y X - [X, Y]\big).$$

In other words the value of

$$\tau(X,Y) := \nabla_X Y - \nabla_Y X - [X,Y]$$

at any point $x$ depends only on the values $X(x), Y(x)$ of the vector fields at $x$.
So $\tau$ defines a tensor field of type (1,2) in the sense that it assigns to any pair
of tangent vectors at a point a third tangent vector at that point. This tensor
field is called the torsion tensor of the connection. So a connection has zero
torsion if and only if

$$\nabla_X Y - \nabla_Y X = [X,Y] \qquad (3.14)$$

for all pairs of vector fields $X$ and $Y$. In terms of local coordinates, $[\partial_i, \partial_j] = 0$.
So

$$\tau(\partial_i, \partial_j) = \nabla_i\partial_j - \nabla_j\partial_i = \sum_k \big(\Gamma^k_{ij} - \Gamma^k_{ji}\big)\,\partial_k.$$

Thus a connection has zero torsion if and only if its Christoffel symbols are
symmetric in $i$ and $j$.
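
The symmetry criterion is easy to test on a table of Christoffel symbols at a point. The following sketch is illustrative only: the storage convention `gamma[k][i][j]` for $\Gamma^k_{ij}$ and both function names are assumptions made here, not notation from the text.

```python
def torsion_components(gamma):
    """Return T[k][i][j] = Gamma^k_{ij} - Gamma^k_{ji} for symbols stored as gamma[k][i][j]."""
    n = len(gamma)
    return [[[gamma[k][i][j] - gamma[k][j][i] for j in range(n)]
             for i in range(n)]
            for k in range(n)]

def is_torsion_free(gamma, tol=1e-12):
    """True when every torsion component vanishes up to the tolerance."""
    return all(abs(t) <= tol
               for block in torsion_components(gamma)
               for row in block
               for t in row)

def polar_symbols(r):
    """Christoffel symbols of the flat plane in polar coordinates (index 0 = r, 1 = theta)."""
    gamma = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    gamma[0][1][1] = -r          # Gamma^r_{theta theta}
    gamma[1][0][1] = 1.0 / r     # Gamma^theta_{r theta}
    gamma[1][1][0] = 1.0 / r     # Gamma^theta_{theta r}
    return gamma
```

Breaking the symmetry by hand, for instance by setting `gamma[1][1][0] = 0.0`, produces a connection with torsion component $\tau^\theta_{r\theta} = 1/r$.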

3.7 Curvature. 

The curvature $R = R(\nabla)$ of the connection $\nabla$ is defined to be the map
$\mathcal{V}(M)^3 \to \mathcal{V}(M)$ assigning to three vector fields $X, Y, Z$ the value

$$R_{XY}Z := [\nabla_X, \nabla_Y]Z - \nabla_{[X,Y]}Z. \qquad (3.15)$$

The expression $[\nabla_X, \nabla_Y]$ occurring on the right in (3.15) is the commutator of
the two operators $\nabla_X$ and $\nabla_Y$, that is $[\nabla_X, \nabla_Y] = \nabla_X \circ \nabla_Y - \nabla_Y \circ \nabla_X$. We
first observe that $R$ is a tensor, i.e. that the value of $R_{XY}Z$ at a point depends
only on the values of $X$, $Y$, and $Z$ at that point. To see this we must show that

$$R_{fX\,gY}(hZ) = fgh\,R_{XY}Z$$

for any three smooth functions $f$, $g$ and $h$. For this it suffices to check this one
function at a time, i.e. when two of the three functions are identically equal to one. For
example, if $f \equiv 1 \equiv h$ we have

$$\begin{aligned}
-R_{X\,gY}Z &= \nabla_{[X,gY]}Z - \nabla_X\nabla_{gY}Z + \nabla_{gY}\nabla_X Z \\
&= (Xg)\nabla_Y Z + g\nabla_{[X,Y]}Z - (Xg)\nabla_Y Z - g\nabla_X\nabla_Y Z + g\nabla_Y\nabla_X Z \\
&= -g\,R_{XY}Z.
\end{aligned}$$

Since $R$ is anti-symmetric in $X$ and $Y$ we conclude that $R_{fX\,Y}Z = fR_{XY}Z$.
Finally,

$$\begin{aligned}
-R_{XY}(hZ) &= ([X,Y]h)Z + h\nabla_{[X,Y]}Z - \nabla_X\big((Yh)Z + h\nabla_Y Z\big) + \nabla_Y\big((Xh)Z + h\nabla_X Z\big) \\
&= -hR_{XY}Z + \big([X,Y]h - (XY - YX)h\big)Z - (Xh)\nabla_Y Z - (Yh)\nabla_X Z + (Yh)\nabla_X Z + (Xh)\nabla_Y Z \\
&= -hR_{XY}Z.
\end{aligned}$$

Thus we get a curvature tensor (of type (1,3)) which assigns to every three
tangent vectors $\xi, \eta, \zeta$ at a point $x$ the value

$$R_{\xi\eta}\zeta := (R_{XY}Z)(x)$$

where $X, Y, Z$ are any three vector fields with $X(x) = \xi$, $Y(x) = \eta$, $Z(x) = \zeta$.
Alternatively, we speak of the curvature operator at the point $x$ defined by

$$R_{\xi\eta} : TM_x \to TM_x, \qquad R_{\xi\eta} : \zeta \mapsto R_{\xi\eta}\zeta.$$

As we mentioned, the curvature operator is anti-symmetric in $\xi$ and $\eta$:

$$R_{\xi\eta} = -R_{\eta\xi}.$$

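The classical expression of the curvature tensor in terms of the Christoffel
symbols is obtained as follows: Since $[\partial_k, \partial_\ell] = 0$,

$$\begin{aligned}
R_{\partial_k\partial_\ell}\partial_j &= \nabla_k(\nabla_\ell\partial_j) - \nabla_\ell(\nabla_k\partial_j) \\
&= \sum_m \frac{\partial\Gamma^m_{\ell j}}{\partial x^k}\,\partial_m + \sum_{m,r}\Gamma^m_{\ell j}\Gamma^r_{km}\,\partial_r - \sum_m \frac{\partial\Gamma^m_{kj}}{\partial x^\ell}\,\partial_m - \sum_{m,r}\Gamma^m_{kj}\Gamma^r_{\ell m}\,\partial_r \\
&= \sum_i R^i_{jk\ell}\,\partial_i
\end{aligned}$$

where

$$R^i_{jk\ell} = \frac{\partial\Gamma^i_{\ell j}}{\partial x^k} - \frac{\partial\Gamma^i_{kj}}{\partial x^\ell} + \sum_r \Gamma^i_{kr}\Gamma^r_{\ell j} - \sum_r \Gamma^i_{\ell r}\Gamma^r_{kj}.$$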
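
The component formula can be checked numerically on a familiar example. The sketch below evaluates $R^i_{jk\ell} = \partial_k\Gamma^i_{\ell j} - \partial_\ell\Gamma^i_{kj} + \sum_r \Gamma^i_{kr}\Gamma^r_{\ell j} - \sum_r \Gamma^i_{\ell r}\Gamma^r_{kj}$, with the sign convention of (3.15), on the round unit sphere. The closed forms of the Christoffel symbols, the index convention (0 for $\theta$, 1 for $\phi$), the central-difference derivative and all function names are assumptions made for this illustration.

```python
import math

def christoffel_sphere(theta):
    """G[k][i][j] = Gamma^k_{ij} on the unit sphere; index 0 = theta, 1 = phi."""
    G = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    G[0][1][1] = -math.sin(theta) * math.cos(theta)
    G[1][0][1] = G[1][1][0] = math.cos(theta) / math.sin(theta)
    return G

def d_christoffel(theta, direction, h=1e-5):
    """Partial derivative of every symbol along one coordinate (central differences)."""
    if direction == 1:  # nothing depends on phi
        return [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    plus, minus = christoffel_sphere(theta + h), christoffel_sphere(theta - h)
    return [[[(plus[a][b][c] - minus[a][b][c]) / (2 * h) for c in range(2)]
             for b in range(2)]
            for a in range(2)]

def riemann(theta, i, j, k, l):
    """R^i_{jkl} defined by R_{d_k d_l} d_j = sum_i R^i_{jkl} d_i."""
    G = christoffel_sphere(theta)
    dk, dl = d_christoffel(theta, k), d_christoffel(theta, l)
    value = dk[i][l][j] - dl[i][k][j]
    for r in range(2):
        value += G[i][k][r] * G[r][l][j] - G[i][l][r] * G[r][k][j]
    return value
```

With these conventions one finds $R^\theta_{\phi\theta\phi} = \sin^2\theta$ and $R^\phi_{\theta\theta\phi} = -1$, and the components change sign when $k$ and $\ell$ are interchanged, as the anti-symmetry of the curvature operator requires.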



If the connection has zero torsion we claim that

$$R_{\xi\eta}\zeta + R_{\eta\zeta}\xi + R_{\zeta\xi}\eta = 0, \qquad (3.17)$$

or, using the cyclic sum notation we introduced with the Jacobi identity, that

$$\mathrm{Cyc}\; R_{\xi\eta}\zeta = 0.$$

To prove this, we may extend $\xi$, $\eta$, and $\zeta$ to vector fields $X$, $Y$, $Z$ whose brackets all
vanish (say by using vector fields with constant coefficients in a coordinate
neighborhood). Then

$$R_{XY}Z = \nabla_X\nabla_Y Z - \nabla_Y\nabla_X Z.$$

Therefore

$$\begin{aligned}
\mathrm{Cyc}\; R_{XY}Z &= \mathrm{Cyc}\;\nabla_X\nabla_Y Z - \mathrm{Cyc}\;\nabla_Y\nabla_X Z \\
&= \mathrm{Cyc}\;\nabla_X\nabla_Y Z - \mathrm{Cyc}\;\nabla_X\nabla_Z Y
\end{aligned}$$

since making a cyclic permutation in an expression $\mathrm{Cyc}\; F(X, Y, Z)$ does not
affect its value. But the fact that the connection is torsion free means that we
can write the last expression as

$$\mathrm{Cyc}\;\nabla_X[Y,Z] = 0$$

by our assumption that all Lie brackets vanish. QED

3.8 Isometric connections. 

Suppose that $M$ is a semi-Riemannian manifold, meaning that we are given
a smoothly varying non-degenerate scalar product $( , )_x$ on each tangent space
$TM_x$. Given two vector fields $X$ and $Y$, we let $(X, Y)$ denote the function

$$(X,Y)(x) := (X(x), Y(x))_x.$$

We say that a connection $\nabla$ is isometric for $( , )$ if

$$X(Y, Z) = (\nabla_X Y, Z) + (Y, \nabla_X Z) \qquad (3.18)$$

for any three vector fields $X, Y, Z$. It is a sort of Leibniz's rule for scalar
products. If we go back to the definition of the derivative of a vector field along a
curve arising from the connection $\nabla$, we see that (3.18) implies that

$$\frac{d}{dt}(Y,Z) = (Y', Z) + (Y, Z')$$

for any pair of vector fields $Y$, $Z$ along a curve $C$. In particular, if $Y$ and $Z$ are
parallel along the curve, so that $Y' = Z' = 0$, we see that $(Y, Z)$ is constant.
This is the key meaning of the condition that a connection be isometric: parallel
translation along any curve is an isometry of the tangent spaces.

3.9 Levi-Civita's theorem. 

This asserts that on any semi-Riemannian manifold there exists a unique
connection which is isometric and is torsion free. It is characterized by the Koszul
formula

$$2(\nabla_V W, X) = V(W, X) + W(X, V) - X(V, W) - (V, [W, X]) + (W, [X, V]) + (X, [V, W]) \qquad (3.19)$$

for any three vector fields $X, V, W$. To prove Koszul's formula, we apply the
isometric condition to each of the first three terms occurring on the right hand
side of (3.19). For example the first term becomes $(\nabla_V W, X) + (W, \nabla_V X)$. We
apply the torsion free condition to each of the last three terms. For example
the last term becomes $(X, \nabla_V W - \nabla_W V)$. There will be a lot of cancellation
leaving the left hand side. Since the vector field $\nabla_V W$ is determined by knowing
its scalar product $(\nabla_V W, X)$ for all vector fields $X$, the Koszul formula proves
the uniqueness part of Levi-Civita's theorem.

On the other hand, the right hand side of the Koszul formula is function
linear in $X$, i.e.

$$(\nabla_V W, fX) = f(\nabla_V W, X)$$

as can be checked using the properties of $\nabla$ and Lie bracket. So we obtain a
well defined vector field $\nabla_V W$, and it is routine to check that this satisfies the
conditions for a connection and is torsion free and isometric.

We can use the Koszul identity to derive a formula for the Christoffel symbols
in terms of the metric. First some standard notations: We will use the symbol
$g$ to stand for the metric, so $g$ is just another notation for $( , )$. In a local
coordinate system we write

$$g_{ij} := (\partial_i, \partial_j)$$

so

$$g = \sum_{i,j} g_{ij}\, dx^i \otimes dx^j.$$

Here the $g_{ij}$ are functions on the coordinate neighborhood, but we are
suppressing the functional dependence on the points in the notation. The metric $g$ is
a (symmetric) tensor of type (0,2). It induces an isomorphism (at each point)
of the tangent space with the cotangent space, each tangent vector $\xi$ going into
the linear function $(\xi, \cdot)$ consisting of scalar product by $\xi$. By the above formula
the map is given by

$$\partial_i \mapsto \sum_j g_{ij}\, dx^j.$$

This isomorphism induces a scalar product on the cotangent space at each point,
and so a tensor of type (2,0) which we shall denote by $\hat g$ or sometimes by $g^{\uparrow\uparrow}$.
We write

$$g^{ij} := (dx^i, dx^j)$$

so

$$\hat g = \sum_{i,j} g^{ij}\, \partial_i \otimes \partial_j.$$

(The transition from the two lower indices to the two upper indices is the reason
for the vertical arrows notation.) The metric on the cotangent spaces induces a
map into its dual space, which is the tangent space, given by

$$dx^i \mapsto \sum_j g^{ij}\, \partial_j$$

and the two maps - from tangent spaces to cotangent spaces and vice versa -
are inverses of one another, so

$$\sum_k g_{ik}\, g^{km} = \delta_i^m;$$

the "matrices" $(g_{ij})$ and $(g^{k\ell})$ are inverses.

Now let us substitute $X = \partial_m$, $V = \partial_i$, $W = \partial_j$ into the Koszul formula
(3.19). All brackets on the right vanish and we get

$$2(\nabla_i\partial_j, \partial_m) = \partial_i(g_{jm}) + \partial_j(g_{im}) - \partial_m(g_{ij}).$$

Since

$$\nabla_i\partial_j = \sum_k \Gamma^k_{ij}\,\partial_k$$

is the definition of the Christoffel symbols, the preceding equation becomes

$$2\sum_a \Gamma^a_{ij}\, g_{am} = \partial_i(g_{jm}) + \partial_j(g_{im}) - \partial_m(g_{ij}).$$

Multiplying this equation by $g^{mk}$ and summing over $m$ gives

$$\Gamma^k_{ij} = \frac{1}{2}\sum_m g^{km}\Big(\frac{\partial g_{jm}}{\partial x^i} + \frac{\partial g_{im}}{\partial x^j} - \frac{\partial g_{ij}}{\partial x^m}\Big). \qquad (3.20)$$

In principle, we should substitute this formula into (3.11) and solve to obtain
the geodesics. In practice this is a mess for a general coordinate system and
so we will spend a good bit of time developing other means (usually group
theoretical) for finding geodesics. However the equations are manageable in
orthogonal coordinates.
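
Formula (3.20) is mechanical enough to evaluate numerically. The following sketch does so in two dimensions, approximating the partial derivatives of the metric by central differences. Everything about the implementation is an assumption made for illustration: the restriction to two dimensions, the representation of a metric as a Python function returning a 2 x 2 list, and the function names.

```python
def inverse_2x2(m):
    """Inverse of a 2 x 2 matrix given as nested lists."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def metric_derivative(metric, x, m, h=1e-5):
    """Matrix of d g_ij / d x^m at the point x, by central differences."""
    xp, xm = list(x), list(x)
    xp[m] += h
    xm[m] -= h
    gp, gm = metric(xp), metric(xm)
    return [[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(2)] for i in range(2)]

def christoffel(metric, x):
    """gamma[k][i][j] = Gamma^k_{ij} computed from formula (3.20)."""
    g_inv = inverse_2x2(metric(x))
    dg = [metric_derivative(metric, x, m) for m in range(2)]  # dg[m][i][j] = d g_ij / d x^m
    return [[[0.5 * sum(g_inv[k][m] * (dg[i][j][m] + dg[j][i][m] - dg[m][i][j])
                        for m in range(2))
              for j in range(2)]
             for i in range(2)]
            for k in range(2)]

def polar_metric(x):
    """Flat metric dr^2 + r^2 d(theta)^2 in polar coordinates x = (r, theta)."""
    r = x[0]
    return [[1.0, 0.0], [0.0, r * r]]
```

For the polar metric at $r = 2$ this reproduces $\Gamma^r_{\theta\theta} = -r$ and $\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r$.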



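3.10 Geodesics in orthogonal coordinates.

A coordinate system is called orthogonal if

$$g_{ij} = 0, \qquad i \neq j.$$

If we are lucky enough to have an orthogonal coordinate system the equations
for geodesics take on a somewhat simpler form. First notice that (3.20) becomes

$$\Gamma^k_{ij} = \frac{1}{2}\, g^{kk}\Big(\frac{\partial g_{jk}}{\partial x^i} + \frac{\partial g_{ik}}{\partial x^j} - \frac{\partial g_{ij}}{\partial x^k}\Big).$$

So (3.11) becomes

$$\frac{d^2x^k}{dt^2} + g^{kk}\sum_i \frac{\partial g_{kk}}{\partial x^i}\,\frac{dx^k}{dt}\,\frac{dx^i}{dt} - \frac{1}{2}\, g^{kk}\sum_i \frac{\partial g_{ii}}{\partial x^k}\Big(\frac{dx^i}{dt}\Big)^2 = 0.$$

If we multiply this equation by $g_{kk}$ and bring the negative term to the other
side we obtain

$$\frac{d}{dt}\Big(g_{kk}\,\frac{dx^k}{dt}\Big) = \frac{1}{2}\sum_i \frac{\partial g_{ii}}{\partial x^k}\Big(\frac{dx^i}{dt}\Big)^2 \qquad (3.21)$$

as the equations for geodesics in orthogonal coordinates.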
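
Equations (3.21) lend themselves to direct numerical integration, since they are first order equations for the quantities $g_{kk}\, dx^k/dt$. The sketch below does this for the round unit sphere with $g_{\theta\theta} = 1$, $g_{\phi\phi} = \sin^2\theta$. The choice of example, the Runge-Kutta integrator and all names are assumptions made for illustration only.

```python
import math

def geodesic_rhs(state):
    """Right hand side of (3.21) for the state (theta, phi, p_theta, p_phi), p_k = g_kk dx^k/dt."""
    theta, phi, p_theta, p_phi = state
    phi_dot = p_phi / math.sin(theta) ** 2
    # k = theta: d/dt(theta') = (1/2) d(sin^2 theta)/d(theta) * (phi')^2
    dp_theta = math.sin(theta) * math.cos(theta) * phi_dot ** 2
    # k = phi: no metric coefficient depends on phi, so g_phiphi * phi' is conserved
    return (p_theta, phi_dot, dp_theta, 0.0)

def rk4_step(f, y, h):
    """One classical fourth-order Runge-Kutta step for a tuple-valued state."""
    k1 = f(y)
    k2 = f(tuple(a + 0.5 * h * b for a, b in zip(y, k1)))
    k3 = f(tuple(a + 0.5 * h * b for a, b in zip(y, k2)))
    k4 = f(tuple(a + h * b for a, b in zip(y, k3)))
    return tuple(a + h * (b + 2 * c + 2 * d + e) / 6
                 for a, b, c, d, e in zip(y, k1, k2, k3, k4))

def integrate_geodesic(theta, phi, theta_dot, phi_dot, t_end, steps=4000):
    """Follow the geodesic with the given initial position and velocity up to time t_end."""
    state = (theta, phi, theta_dot, math.sin(theta) ** 2 * phi_dot)
    h = t_end / steps
    for _ in range(steps):
        state = rk4_step(geodesic_rhs, state, h)
    return state

def speed_squared(state):
    """Squared length of the velocity in the round metric."""
    theta, _, p_theta, p_phi = state
    return p_theta ** 2 + p_phi ** 2 / math.sin(theta) ** 2
```

Starting on the equator with a unit velocity, the computed curve is half of a great circle: after time $\pi$ it arrives at the antipodal point, with its speed unchanged.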

3.11 Curvature identities. 

The curvature of the Levi-Civita connection satisfies several additional identities
beyond the two curvature identities that we have already discussed. Let us
choose vector fields $X, Y, V$ with vanishing brackets. We have

$$\begin{aligned}
-(R_{XY}V, V) &= -(\nabla_X\nabla_Y V, V) + (\nabla_Y\nabla_X V, V) \\
&= Y(\nabla_X V, V) - (\nabla_X V, \nabla_Y V) - X(\nabla_Y V, V) + (\nabla_Y V, \nabla_X V) \\
&= \tfrac{1}{2}\,YX(V,V) - \tfrac{1}{2}\,XY(V,V) \\
&= -\tfrac{1}{2}\,[X,Y](V,V) \\
&= 0.
\end{aligned}$$

This implies that for any three tangent vectors we have

$$(R_{\xi\eta}\zeta, \zeta) = 0$$

and hence by polarization that for any four tangent vectors we have

$$(R_{\xi\eta}v, \zeta) = -(v, R_{\xi\eta}\zeta). \qquad (3.22)$$

This equation says that the curvature operator $R_{\xi\eta}$ acts as an infinitesimal
orthogonal transformation on the tangent space.

The last identity we want to discuss is the symmetry property

$$(R_{\xi\eta}v, \zeta) = (R_{v\zeta}\xi, \eta). \qquad (3.23)$$

The proof consists of starting with the identity

$$\mathrm{Cyc}\; R_{\xi\eta}v = 0$$

and taking the scalar product with $\zeta$ to obtain

$$(\mathrm{Cyc}\; R_{\xi\eta}v, \zeta) = 0.$$

This is an equation involving three terms. Take the cyclic permutations of the
four vectors to obtain four equations like this involving twelve terms in all.
When we add the four equations eight of the terms cancel in pairs and the
remaining terms give (3.23). We summarize the symmetry properties of the
Riemann curvature:

• $R_{\xi\eta} = -R_{\eta\xi}$

• $(R_{\xi\eta}v, \zeta) = -(v, R_{\xi\eta}\zeta)$

• $R_{\xi\eta}\zeta + R_{\eta\zeta}\xi + R_{\zeta\xi}\eta = 0$

• $(R_{\xi\eta}v, \zeta) = (R_{v\zeta}\xi, \eta)$.



3.12 Sectional curvature. 

Let $V$ be a vector space with a non-degenerate symmetric bilinear form $(\cdot,\cdot)$. A
subspace is called non-degenerate if the restriction of $(\cdot,\cdot)$ to this
subspace is non-degenerate. (If $(\cdot,\cdot)$ is positive definite, then all subspaces are
non-degenerate.) A two dimensional subspace $\Pi$ is non-degenerate if and only
if for any basis $v, w$ of $\Pi$ the "area"

$$Q(v,w) := (v,v)(w,w) - (v,w)^2$$

does not vanish.

Let $\Pi$ be a non-degenerate plane (= two dimensional subspace) of the tangent
space $TM_x$ of a semi-Riemannian manifold. Then its sectional curvature is
defined as

$$\mathrm{sec}(\Pi) := -\frac{(R_{vw}v, w)}{Q(v,w)}. \qquad (3.24)$$

It is easy to check that this is independent of the choice of basis $v, w$.



3.13 Ricci curvature. 

If we hold $\xi \in TM_x$ and $\eta \in TM_x$ fixed in $R_{\xi v}\eta$ then the map

$$v \mapsto R_{\xi v}\eta, \qquad v \in TM_x,$$

is a linear map of $TM_x$ into itself. Its trace (which is bilinear in $\xi$ and $\eta$) is
known as the Ricci curvature tensor:

$$\mathrm{Ric}(\xi, \eta) := \mathrm{tr}\,[v \mapsto R_{\xi v}\eta]. \qquad (3.25)$$

Ricci curvature plays a key role in general relativity because it is the Ricci
curvature rather than the full Riemann curvature which enters into the
Einstein field equations.

3.14 Bi-invariant metrics on a Lie group.

The simplest example of a Riemannian manifold is Euclidean space, where the
geodesics are straight lines and all curvatures vanish. We may think of Euclidean
space as a commutative Lie group under addition, and view the straight lines as
translates of one parameter subgroups (lines through the origin). An easy but
important generalization of this is when we consider bi-invariant metrics on a
Lie group, a concept we shall explain below. In this case also, the geodesics are
the translates of one parameter subgroups.

3.14.1 The Lie algebra of a Lie group. 

Let $G$ be a Lie group. This means that $G$ is a group, and is a smooth manifold
such that the multiplication map $G \times G \to G$ is smooth, as is the map
$\mathrm{inv} : G \to G$ sending every element into its inverse:

$$\mathrm{inv} : a \mapsto a^{-1}, \qquad a \in G.$$

Until now the Lie groups we studied were given as subgroups of $Gl(n)$. We can
continue in this vein, or work with the more general definition just given. We
have the left action of $G$ on itself

$$L_a : G \to G, \qquad b \mapsto ab$$

and the right action

$$R_a : G \to G, \qquad b \mapsto ba^{-1}.$$

We let $\mathfrak{g}$ denote the tangent space to $G$ at the identity:

$$\mathfrak{g} = TG_e.$$

We identify $\mathfrak{g}$ with the space of all left invariant vector fields on $G$, so $\xi \in \mathfrak{g}$
is identified with the vector field $X$ which assigns to every $a \in G$ the tangent
vector

$$d(L_a)_e\,\xi \in TG_a.$$

We will alternatively use the notation $X, Y$ or $\xi, \eta$ for elements of $\mathfrak{g}$.

The left invariant vector field $X$ generates a one parameter group of
transformations which commutes with all left multiplications and so must consist
of a one parameter group of right multiplications. In the case of a subgroup
of $Gl(n)$, where $\mathfrak{g}$ was identified with a subspace of the space of all $n \times n$
matrices, we saw that this was the one parameter group of transformations

$$A \mapsto A \exp tX,$$

i.e. the one parameter group

$$R_{\exp -tX}.$$

So we might as well use this notation in general: $\exp tX$ denotes the one
parameter subgroup of $G$ obtained by looking at the solution curve through $e$ of the left
invariant vector field $X$, and then the one parameter group of transformations
generated by the vector field $X$ is $R_{\exp -tX}$.

Let $X$ and $Y$ be elements of $\mathfrak{g}$ thought of as left invariant vector fields, and
let us compute their Lie bracket as vector fields. So let

$$\phi_t = R_{\exp -tX}$$

be the one parameter group of transformations generated by $X$. According to
the general definition, the Lie bracket $[X, Y]$ is obtained by differentiating the
time dependent vector field $\phi_t^* Y$ at $t = 0$. By definition, the pull-back $\phi_t^* Y$ is
the vector field which assigns to the point $a$ the tangent vector

$$(d\phi_t)^{-1}\, Y(\phi_t(a)) = (dR_{\exp tX})_{a \exp tX}\; Y(a(\exp tX)). \qquad (3.26)$$

In the case that $G$ is a subgroup of the general linear group, this is precisely the
left invariant vector field

$$a \mapsto a(\exp tX)\,Y\,(\exp -tX).$$

Differentiating with respect to $t$ and setting $t = 0$ shows that the vector field
$[X, Y]$ is precisely the left invariant vector field corresponding to the commutator
of the two matrices $X$ and $Y$.

We can mimic this computation for a general Lie group, not necessarily given
as a subgroup of $Gl(n)$: First let us record the special case of (3.26) when we
take $a = e$:

$$(d\phi_t)^{-1}\, Y(\phi_t(e)) = (dR_{\exp tX})_{\exp tX}\; Y(\exp tX). \qquad (3.27)$$

For any $a \in G$ we let $A_a$ denote conjugation by the element $a \in G$, so

$$A_a : G \to G, \qquad A_a(b) = aba^{-1}.$$

We have $A_a(e) = e$, and $A_a$ carries one-parameter subgroups into one
parameter subgroups. In particular the differential of $A_a$ at $e$ is a linear
transformation of $TG_e = \mathfrak{g}$ which we shall denote by $\mathrm{Ad}_a$:

$$d(A_a)_e =: \mathrm{Ad}_a : TG_e \to TG_e.$$

We have

$$A_a = L_a \circ R_a = R_a \circ L_a.$$

So if $Y$ is the left invariant vector field on $G$ corresponding to $\eta \in TG_e = \mathfrak{g}$, we
have $d(L_a)_e(\eta) = Y(a)$ and so

$$d(A_a)_e\,\eta = d(R_a)_a \circ d(L_a)_e\,\eta = d(R_a)_a\, Y(a).$$

Set $a = \exp tX$, and compare this with (3.27). Differentiate with respect to $t$
and set $t = 0$. We see that the left invariant vector field $[X, Y]$ corresponds to
the element of $TG_e$ obtained by differentiating $\mathrm{Ad}_{\exp tX}\,\eta$ with respect to $t$ and
setting $t = 0$. In symbols, we can write this as

$$\frac{d}{dt}\,\mathrm{Ad}_{\exp tX}\Big|_{t=0} = \mathrm{ad}(X) \quad\text{where}\quad \mathrm{ad}(X) : \mathfrak{g} \to \mathfrak{g}, \quad \mathrm{ad}(X)Y := [X, Y]. \qquad (3.28)$$

Now $\mathrm{ad}(X)$ as defined above is a linear transformation of $\mathfrak{g}$. So we can consider
the corresponding one parameter group $\exp t\,\mathrm{ad}(X)$ of linear transformations
of $\mathfrak{g}$ (using the usual formula for the exponential of a matrix). But (3.28)
says that $\mathrm{Ad}_{\exp tX}$ is a one parameter group of linear transformations with the
same derivative, $\mathrm{ad}(X)$, at $t = 0$. The uniqueness theorem for linear differential
equations then implies the important formula

$$\exp(t\,\mathrm{ad}(X)) = \mathrm{Ad}_{\exp tX}. \qquad (3.29)$$
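
For a matrix group, where $\mathrm{Ad}_a Y = aYa^{-1}$ and $\mathrm{ad}(X)Y = XY - YX$, formula (3.29) can be checked numerically. The sketch below is a hedged illustration, not part of the text: it uses plain nested lists for matrices, truncated power series for both exponentials, and hypothetical function names.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def scaled(a, s):
    return [[s * v for v in row] for row in a]

def added(a, b):
    return [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a, b)]

def expm(a, terms=40):
    """Matrix exponential by its power series (adequate for small matrices of modest norm)."""
    n = len(a)
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    result = term
    for k in range(1, terms):
        term = scaled(matmul(term, a), 1.0 / k)
        result = added(result, term)
    return result

def bracket(x, y):
    """Commutator [x, y] = xy - yx."""
    return added(matmul(x, y), scaled(matmul(y, x), -1.0))

def adjoint_action(t, x, y):
    """Ad_{exp tX} Y = (exp tX) Y (exp -tX)."""
    return matmul(matmul(expm(scaled(x, t)), y), expm(scaled(x, -t)))

def exp_of_ad(t, x, y, terms=40):
    """exp(t ad X) Y = sum over n of t^n / n! * (ad X)^n Y."""
    term, result = y, y
    for k in range(1, terms):
        term = scaled(bracket(x, term), t / k)
        result = added(result, term)
    return result

def max_abs_diff(a, b):
    return max(abs(u - v) for ra, rb in zip(a, b) for u, v in zip(ra, rb))
```

Taking $X$ and $Y$ to be two of the standard generators of rotations in three dimensions, the two sides agree to rounding error.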
3.14.2 The general Maurer-Cartan form. 

If $v \in TG_a$ is a tangent vector at the point $a \in G$, there will be a unique left
invariant vector field $X$ such that $X(a) = v$. In other words, there is a linear
map

$$\omega_a : TG_a \to \mathfrak{g}$$

sending the tangent vector $v$ to the element $\xi = \omega_a(v) \in \mathfrak{g}$ where the left
invariant vector field $X$ corresponding to $\xi$ satisfies $X(a) = v$. So we have
defined a $\mathfrak{g}$ valued linear differential form $\omega$ which identifies the tangent space at any
$a \in G$ with $\mathfrak{g}$. If

$$dL_b\, v = w \in TG_{ba}$$

then $X(ba) = w$ since $X(a) = v$ and $X$ is left invariant. In other words,

$$\omega_{L_b a} \circ dL_b = \omega_a,$$

or, what amounts to the same thing,

$$L_b^*\,\omega = \omega$$

for all $b \in G$. The form $\omega$ is left invariant. When we proved this for a subgroup of
$Gl(n)$ this was a computation. But in the general case, as we have just seen, it is
a tautology. We now want to establish the generalization of the Maurer-Cartan
equation (2.9) which said that for subgroups of $Gl(n)$ we have

$$d\omega + \omega \wedge \omega = 0.$$

Since we no longer have, in general, the notion of matrix multiplication which
enters into the definition of $\omega \wedge \omega$, we must first rewrite $\omega \wedge \omega$ in a form
which generalizes to an arbitrary Lie group.

So let us temporarily consider the case of a subgroup of $Gl(n)$. Recall that
for any two form $\tau$ and a pair of vector fields $X$ and $Y$ we write $\tau(X, Y) =
i(Y)i(X)\tau$. Thus

$$(\omega \wedge \omega)(X, Y) = \omega(X)\omega(Y) - \omega(Y)\omega(X),$$

the commutator of the two matrix valued functions $\omega(X)$ and $\omega(Y)$. Consider
the commutator of two matrix valued one forms, $\omega$ and $\sigma$,

$$\omega \wedge \sigma + \sigma \wedge \omega$$

(according to our usual rules of superalgebra). We denote this by

$$[\omega\wedge, \sigma].$$

In particular we may take $\omega = \sigma$ to obtain

$$[\omega\wedge, \omega] = 2\,\omega \wedge \omega.$$

So we can rewrite the Maurer-Cartan equation for a subgroup of $Gl(n)$ as

$$d\omega + \tfrac{1}{2}[\omega\wedge, \omega] = 0. \qquad (3.30)$$

Now for a general Lie group we do have the Lie bracket map

$$\mathfrak{g} \times \mathfrak{g} \to \mathfrak{g}.$$

So we can define the two form $[\omega\wedge, \omega]$. It is a $\mathfrak{g}$ valued two form which satisfies

$$i(X)[\omega\wedge, \omega] = [X, \omega] - [\omega, X]$$

for any left invariant vector field $X$. Hence

$$[\omega\wedge, \omega](X, Y) := i(Y)i(X)[\omega\wedge, \omega] = i(Y)\big([X, \omega] - [\omega, X]\big) = [X, Y] - [Y, X] = 2[X, Y]$$

for any pair of left invariant vector fields $X$ and $Y$. So to prove (3.30) in general,
we must verify that for any pair of left invariant vector fields we have

$$d\omega(X, Y) = -\omega([X, Y]).$$

But this is a consequence of our general formula (2.3) for the exterior derivative
which in our case says that

$$d\omega(X, Y) = X\omega(Y) - Y\omega(X) - \omega([X, Y]).$$

In our situation the first two terms on the right vanish since, for example,
$\omega(Y) = Y = \eta$ is a constant element of $\mathfrak{g}$, so that $X\omega(Y) = 0$, and similarly
$Y\omega(X) = 0$.



3.14.3 Left invariant and bi-invariant metrics. 

Any non-degenerate scalar product $( , )$ on $\mathfrak{g}$ determines (and is equivalent to)
a left invariant semi-Riemann metric on $G$ via the left-identification $dL_a : \mathfrak{g} =
TG_e \to TG_a$, $\forall a \in G$.

Since $A_a = L_a \circ R_a$, the left invariant metric $( , )$ is right invariant if and
only if it is $A_a$ invariant for all $a \in G$, which is the same as saying that $( , )$ is
invariant under the adjoint representation of $G$ on $\mathfrak{g}$, i.e. that

$$(\mathrm{Ad}_a Y, \mathrm{Ad}_a Z) = (Y, Z), \qquad \forall\, Y, Z \in \mathfrak{g},\ a \in G.$$

Setting $a = \exp tX$, $X \in \mathfrak{g}$, differentiating with respect to $t$ and setting $t = 0$
gives

$$([X,Y], Z) + (Y, [X,Z]) = 0, \qquad \forall\, X, Y, Z \in \mathfrak{g}. \qquad (3.31)$$

If every element of $G$ can be written as a product of elements of the form
$\exp \xi$, $\xi \in \mathfrak{g}$ (which will be the case if $G$ is connected), this condition implies
that $( , )$ is invariant under Ad and hence is invariant under right and left
multiplication. Such a metric is called bi-invariant.

Let inv denote the map sending every element into its inverse:

$$\mathrm{inv} : a \mapsto a^{-1}, \qquad a \in G.$$

Since $\mathrm{inv} \exp tX = \exp(-tX)$ we see that

$$d\,\mathrm{inv}_e = -\mathrm{id}.$$

Also, with $R_a : b \mapsto ba^{-1}$ as above,

$$\mathrm{inv} = R_a \circ \mathrm{inv} \circ L_{a^{-1}}$$

since the right hand side sends $b \in G$ into

$$b \mapsto a^{-1}b \mapsto b^{-1}a \mapsto b^{-1}.$$

Hence $d\,\mathrm{inv}_a : TG_a \to TG_{a^{-1}}$ is given, by the chain rule, as

$$dR_a \circ d\,\mathrm{inv}_e \circ dL_{a^{-1}} = -dR_a \circ dL_{a^{-1}},$$

implying that a bi-invariant metric is invariant under the map inv. Conversely,
if a left invariant metric is invariant under inv then it is also right invariant,
hence bi-invariant, since

$$R_a = \mathrm{inv} \circ L_a \circ \mathrm{inv}.$$

3.14.4 Geodesics are cosets of one parameter subgroups.

The Koszul formula simplifies considerably when applied to left invariant vector
fields and bi-invariant metrics since all scalar products are constant, so their
derivatives vanish, and we are left with

$$2(\nabla_X Y, Z) = -(X, [Y, Z]) - (Y, [X, Z]) + (Z, [X, Y])$$

and the first two terms cancel by (3.31). We are left with

$$\nabla_X Y = \tfrac{1}{2}[X, Y]. \qquad (3.32)$$

Conversely, if $( , )$ is a left invariant metric for which (3.32) holds, then

$$\begin{aligned}
(X, [Y,Z]) &= 2(X, \nabla_Y Z) \\
&= -2(\nabla_Y X, Z) \\
&= -([Y,X], Z) \\
&= ([X,Y], Z)
\end{aligned}$$

so the metric is bi-invariant.

Let $\alpha$ be an integral curve of the left invariant vector field $X$. Equation
(3.32) implies that $\alpha'' = \nabla_X X = 0$, so $\alpha$ is a geodesic. Thus the one-parameter
groups are the geodesics through the identity, and all geodesics are left cosets
of one parameter groups. (This is the reason for the name exponential map in
Riemannian geometry which we shall study in Chapter V.)

In Chapter VIII we will study Riemannian submersions. It will emerge from
this study that if we have a quotient space $B = G/H$ of a group with a
bi-invariant metric (satisfying some mild conditions), then the geodesics on $B$ in
the induced metric are orbits of certain one parameter subgroups. For example,
the geodesics on spheres are the great circles.



3.14.5 The Riemann curvature of a bi-invariant metric. 

We compute the Riemann curvature of a bi-invariant metric by applying the
definition (3.15) to left invariant vector fields:

$$R_{XY}Z = \tfrac{1}{4}[X, [Y, Z]] - \tfrac{1}{4}[Y, [X, Z]] - \tfrac{1}{2}[[X,Y], Z].$$

Jacobi's identity implies the first two terms add up to $\tfrac{1}{4}[[X, Y], Z]$ and so

$$R_{XY}Z = -\tfrac{1}{4}[[X,Y], Z]. \qquad (3.33)$$

3.14.6 Sectional curvatures. 

In particular

$$(R_{XY}X, Y) = -\tfrac{1}{4}([[X,Y],X], Y) = -\tfrac{1}{4}([X,Y], [X,Y])$$

so

$$K(X,Y) = \frac{1}{4}\,\frac{\|[X,Y]\|^2}{\|X \wedge Y\|^2}. \qquad (3.34)$$
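
Formulas (3.33) and (3.34) can be tried out on the smallest non-commutative example. The sketch below assumes, for illustration only, the Lie algebra $\mathbf{R}^3$ with the cross product as bracket (isomorphic to the Lie algebra of the rotation group) and the Euclidean dot product as invariant scalar product; the function names are hypothetical. The sectional curvature is computed from the curvature operator as $-(R_{XY}X, Y)/Q(X,Y)$.

```python
def cross(a, b):
    """Cross product, playing the role of the Lie bracket [a, b]."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Euclidean dot product, assumed here as the bi-invariant scalar product."""
    return sum(u * v for u, v in zip(a, b))

def invariance_defect(x, y, z):
    """Left side of (3.31): ([x,y],z) + (y,[x,z]); it should vanish."""
    return dot(cross(x, y), z) + dot(y, cross(x, z))

def curvature(x, y, z):
    """R_{xy} z = -(1/4) [[x, y], z], formula (3.33)."""
    return tuple(-0.25 * c for c in cross(cross(x, y), z))

def sectional_curvature(x, y):
    """sec = -(R_{xy} x, y) / Q(x, y) with Q the squared area of the parallelogram."""
    q = dot(x, x) * dot(y, y) - dot(x, y) ** 2
    return -dot(curvature(x, y, x), y) / q
```

In this example $\|[X,Y]\|^2 = \|X \wedge Y\|^2$, so every plane has sectional curvature $1/4$, in agreement with (3.34).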

3.14.7 The Ricci curvature and the Killing form. 

Recall that for each $X \in \mathfrak{g}$ the linear transformation of $\mathfrak{g}$ consisting of bracketing
on the left by $X$ is called $\mathrm{ad}\, X$. So

$$\mathrm{ad}\, X : \mathfrak{g} \to \mathfrak{g}, \qquad \mathrm{ad}\, X(V) := [X, V].$$

We can thus write our formula for the curvature as

$$R_{XV}Y = \tfrac{1}{4}(\mathrm{ad}\, Y)(\mathrm{ad}\, X)V.$$

Now the Ricci curvature was defined as

$$\mathrm{Ric}(X,Y) = \mathrm{tr}\,[V \mapsto R_{XV}Y].$$

We thus see that for any bi-invariant metric, the Ricci curvature is always given
by

$$\mathrm{Ric} = \tfrac{1}{4}B \qquad (3.35)$$

where $B$, the Killing form, is defined by

$$B(X, Y) := \mathrm{tr}\,(\mathrm{ad}\, X)(\mathrm{ad}\, Y). \qquad (3.36)$$

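The Killing form is symmetric, since $\mathrm{tr}\,(AC) = \mathrm{tr}\,(CA)$ for any pair of linear
operators. It is also invariant. Indeed, let $\mu : \mathfrak{g} \to \mathfrak{g}$ be any automorphism of
$\mathfrak{g}$, so $\mu([X, Y]) = [\mu(X), \mu(Y)]$ for all $X, Y \in \mathfrak{g}$. We can read this equation as
saying

$$\mathrm{ad}\,(\mu(X))(\mu(Y)) = \mu(\mathrm{ad}(X)(Y))$$

or

$$\mathrm{ad}\,(\mu(X)) = \mu \circ \mathrm{ad}\, X \circ \mu^{-1}.$$

Hence

$$\mathrm{ad}\,(\mu(X))\,\mathrm{ad}\,(\mu(Y)) = \mu \circ \mathrm{ad}\, X\, \mathrm{ad}\, Y \circ \mu^{-1}.$$

Since trace is invariant under conjugation, it follows that

$$B(\mu(X), \mu(Y)) = B(X, Y).$$

Applying this to $\mu = \exp(t\,\mathrm{ad}\, Z)$ and differentiating at $t = 0$ shows that
$B([Z, X], Y) + B(X, [Z, Y]) = 0$.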

So the Killing form defines a bi-invariant symmetric bilinear form on $G$. Of
course it need not, in general, be non-degenerate. For example, if the group
is commutative, it vanishes identically. A group $G$ is called semi-simple if its
Killing form is non-degenerate. So on a semi-simple Lie group, we can always
choose the Killing form as the bi-invariant metric. For such a choice, our formula
above for the Ricci curvature then shows that the group manifold with this
metric is Einstein, i.e. the Ricci curvature is a multiple of the scalar product.

Suppose that the adjoint representation of $G$ on $\mathfrak{g}$ is irreducible. Then $\mathfrak{g}$ can
not have two invariant non-degenerate scalar products unless one is a multiple
of the other. In this case, we can also conclude from our formula that the group
manifold is Einstein.
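
To see the Killing form and formula (3.35) at work, here is a small computation on the same kind of toy model as before: the Lie algebra $\mathbf{R}^3$ with the cross product as bracket, which is an assumption made for this illustration (as are the function names). The Ricci curvature is computed directly from definition (3.25) as the trace of $V \mapsto R_{XV}Y$ with $R_{XV}Y = -\tfrac{1}{4}[[X,V],Y]$, and compared with $\tfrac{1}{4}B(X,Y)$.

```python
BASIS = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))

def cross(a, b):
    """Cross product, playing the role of the Lie bracket [a, b]."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def ad_matrix(x):
    """Matrix of ad x in the standard basis: column j holds [x, e_j]."""
    columns = [cross(x, e) for e in BASIS]
    return [[columns[j][i] for j in range(3)] for i in range(3)]

def killing_form(x, y):
    """B(x, y) = tr(ad x ad y), definition (3.36)."""
    a, b = ad_matrix(x), ad_matrix(y)
    return sum(a[i][k] * b[k][i] for i in range(3) for k in range(3))

def ricci(x, y):
    """Trace of v -> R_{xv} y, where R_{xv} y = -(1/4) [[x, v], y]."""
    total = 0.0
    for m, e in enumerate(BASIS):
        image = cross(cross(x, e), y)
        total += -0.25 * image[m]
    return total
```

In this example $B(X,Y) = -2\, X \cdot Y$, so the Killing form is non-degenerate (and negative definite), and the computed Ricci curvature equals $\tfrac{1}{4}B$ as (3.35) predicts.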

3.14.8 Bi-invariant forms from representations. 

Here is a way to construct invariant scalar products on a Lie algebra $\mathfrak{g}$ of a
Lie group $G$. Let $\rho$ be a representation of $G$. This means that $\rho$ is a smooth
homomorphism of $G$ into $Gl(n, \mathbf{R})$ or $Gl(n, \mathbf{C})$. This induces a representation
$\dot\rho$ of $\mathfrak{g}$ by

$$\dot\rho(X) := \frac{d}{dt}\,\rho(\exp tX)\Big|_{t=0}.$$

So

$$\dot\rho : \mathfrak{g} \to gl(n), \qquad \dot\rho([X,Y]) = [\dot\rho(X), \dot\rho(Y)],$$

where the bracket on the right is in $gl(n)$. More generally, a linear map $\rho : \mathfrak{g} \to
gl(n, \mathbf{C})$ or $gl(n, \mathbf{R})$ satisfying the above identity is called a representation of
the Lie algebra $\mathfrak{g}$. Every representation of $G$ gives rise to a representation of
$\mathfrak{g}$, but not every representation of $\mathfrak{g}$ need come from a representation of $G$ in
general.

If $\rho$ is a representation of $\mathfrak{g}$, with values in $gl(n, \mathbf{R})$, we may define

$$(X, Y)_\rho := \mathrm{tr}\,\rho(X)\rho(Y).$$

This is a symmetric bilinear form on $\mathfrak{g}$, and

$$([X,Y], Z)_\rho + (Y, [X,Z])_\rho = \mathrm{tr}\,\big(\rho(X)\rho(Y)\rho(Z) - \rho(Y)\rho(X)\rho(Z) + \rho(Y)\rho(X)\rho(Z) - \rho(Y)\rho(Z)\rho(X)\big) = 0.$$

So this is invariant. Of course it need not be non-degenerate.

A case of particular interest is when the representation $\rho$ takes values in
$u(n)$, the Lie algebra of the unitary group. An element of $u(n)$ is a skew adjoint
matrix, i.e. a matrix of the form $iA$ where $A = A^*$ is self adjoint. If $A = A^*$
and $A = (a_{ij})$ then

$$\mathrm{tr}\, A^2 = \mathrm{tr}\, AA^* = \sum_{i,j} |a_{ij}|^2$$

which is positive unless $A = 0$. So

$$-\mathrm{tr}\,(iA)(iA)$$

is positive unless $A = 0$. This implies that if $\rho : \mathfrak{g} \to u(n)$ is injective, then the
form

$$(X,Y) = -\mathrm{tr}\,\rho(X)\rho(Y)$$

is a positive definite invariant scalar product on $\mathfrak{g}$.

For example, let us consider the Lie algebra $\mathfrak{g} = u(2)$ and the representation
$\rho$ of $\mathfrak{g}$ on the exterior algebra of $\mathbf{C}^2$. We may decompose

$$\Lambda(\mathbf{C}^2) = \Lambda^0(\mathbf{C}^2) \oplus \Lambda^1(\mathbf{C}^2) \oplus \Lambda^2(\mathbf{C}^2)$$

and each of the summands is invariant under our representation. Every element
of $u(2)$ acts trivially on $\Lambda^0(\mathbf{C}^2)$ and acts in its standard fashion on $\Lambda^1(\mathbf{C}^2) =
\mathbf{C}^2$. Every element of $u(2)$ acts via multiplication by its trace on $\Lambda^2(\mathbf{C}^2)$, so in
particular all elements of $su(2)$ act trivially there. Thus restricted to $su(2)$, the
induced scalar product is just

$$(X,Y) = -\mathrm{tr}\, XY, \qquad X, Y \in su(2),$$

while on scalar matrices, i.e. matrices of the form $S = riI$, we have

$$(S, S) = 2r^2 + 4r^2 = 6r^2.$$

3.14.9 The Weinberg angle.

The preceding example illustrates the fact that if the adjoint representation of
$\mathfrak{g}$ is not irreducible, there may be more than a one parameter family of invariant
scalar products on $\mathfrak{g}$. Indeed the algebra $u(2)$ decomposes as a sum

$$u(2) = su(2) \oplus u(1)$$

of subalgebras, where $u(1)$ consists of the scalar matrices (which commute with
all elements of $u(2)$). It follows from the invariance condition that $u(1)$ must be
orthogonal to $su(2)$ under any invariant scalar product. Each of these summands
is irreducible under the adjoint representation, so the restriction of any invariant
scalar product to each summand is determined up to positive scalar multiple,
but these multiples can be chosen independently for each summand. So there
is a two parameter family of choices.

In the physics literature it is conventional to write the most general invariant
scalar product on $u(2)$ as

$$(A, B) = -\frac{2}{g_2^2}\,\mathrm{tr}\Big(A - \tfrac{1}{2}(\mathrm{tr}\, A)I\Big)\Big(B - \tfrac{1}{2}(\mathrm{tr}\, B)I\Big) - \frac{1}{g_1^2}\,\mathrm{tr}\, A\;\mathrm{tr}\, B$$

where $g_1$ and $g_2$ are sometimes called "coupling strengths". The first summand
vanishes on $u(1)$ and the second summand vanishes on $su(2)$. The Weinberg
angle $\theta_W$ is defined by

$$\sin^2\theta_W := \frac{g_1^2}{g_1^2 + g_2^2}$$

and plays a key role in electroweak theory, which unifies the electromagnetic
and weak interactions. In the current state of knowledge, there is no broadly
agreed theory that predicts the Weinberg angle. It is an input derived from
experiment. The data as of July 2002 from the Particle Data Group gives a
value of

$$\sin^2\theta_W = 0.23113\ldots .$$
Notice that the computation that we did from the exterior algebra has

$$g_1^2 = \frac{2}{3} \quad\text{and}\quad g_2^2 = 2$$

so

$$\sin^2\theta_W = \frac{\tfrac{2}{3}}{\tfrac{2}{3} + 2} = .25\,.$$

Of course several quite different representations will give the same metric or
Weinberg angle.
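
The arithmetic behind this value can be retraced with a few lines of code. The sketch below is an illustration under stated assumptions: the scalar product is taken to be $-\mathrm{tr}\,\rho(A)\rho(B)$ on the exterior algebra of $\mathbf{C}^2$ (zero on $\Lambda^0$, the matrix itself on $\Lambda^1$, multiplication by the trace on $\Lambda^2$), and the coupling strengths are read off using the normalisation $(A,B) = -\tfrac{2}{g_2^2}\mathrm{tr}(A_0 B_0) - \tfrac{1}{g_1^2}\mathrm{tr} A\, \mathrm{tr} B$ with $A_0$ the traceless part of $A$. The function names and the two test elements are hypothetical choices.

```python
def trace(m):
    """Trace of a 2 x 2 matrix stored as nested lists."""
    return m[0][0] + m[1][1]

def matmul2(a, b):
    """Product of two 2 x 2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def exterior_form(a, b):
    """-(trace of rho(a) rho(b)) on Lambda^0 + Lambda^1 + Lambda^2 of C^2."""
    on_lambda1 = trace(matmul2(a, b))      # standard action on C^2
    on_lambda2 = trace(a) * trace(b)       # multiplication by the trace
    total = complex(0.0 + on_lambda1 + on_lambda2)   # Lambda^0 contributes zero
    return -total.real

def weinberg_from_form(form):
    """Read off g_1^2, g_2^2 and sin^2(theta_W) from an invariant form on u(2)."""
    x = [[1j, 0], [0, -1j]]   # a traceless element, i.e. an element of su(2)
    s = [[1j, 0], [0, 1j]]    # a scalar element of u(1), with r = 1
    g2_squared = -2.0 * complex(trace(matmul2(x, x))).real / form(x, x)
    g1_squared = -complex(trace(s) * trace(s)).real / form(s, s)
    return g1_squared, g2_squared, g1_squared / (g1_squared + g2_squared)
```

On the two test elements the form takes the values $2$ and $6$, which gives $g_2^2 = 2$, $g_1^2 = 2/3$ and $\sin^2\theta_W = 1/4$.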



3.15 Frame fields. 

By a frame field we mean an n-tuplet $E = (E_1, \ldots, E_n)$ of vector fields (defined
on some neighborhood) whose values at every point form a basis of the tangent
space at that point. These then define a dual collection of differential forms

$$\theta^1, \ldots, \theta^n$$

whose values at every point form the dual basis. For example, a coordinate
system $x^1, \ldots, x^n$ provides the frame field $\partial_1, \ldots, \partial_n$ with dual forms $dx^1, \ldots, dx^n$.
But the use of more general frame fields allows for flexibility in computation.

A frame field is called "orthonormal" if (Ei,Ej) = for i + j and 
(Ei,Ej) = 6j where ej = ±1 . For example, applying the Gram-Schmidt 
procedure to an arbitrary frame field for a positive definite metric yields an 
orthonormal one. 
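
As an illustration of the last remark, here is a minimal sketch of the Gram-Schmidt procedure at a single point, assuming the positive definite scalar product is given by a symmetric matrix `g` of components in some basis. The names `inner` and `gram_schmidt` are illustrative only, and the sketch does not handle indefinite metrics, where a vector of zero length can appear.

```python
import math

def inner(g, v, w):
    """Scalar product <v, w> = sum_ij g[i][j] v[i] w[j]."""
    n = len(v)
    return sum(g[i][j] * v[i] * w[j] for i in range(n) for j in range(n))

def gram_schmidt(g, frame):
    """Orthonormalize the vectors of `frame` with respect to the positive
    definite matrix g, processing them in the given order."""
    ortho = []
    for v in frame:
        w = list(v)
        for e in ortho:
            c = inner(g, w, e)  # component of w along the unit vector e
            w = [wi - c * ei for wi, ei in zip(w, e)]
        norm = math.sqrt(inner(g, w, w))
        ortho.append([wi / norm for wi in w])
    return ortho

g = [[2.0, 1.0], [1.0, 3.0]]                       # a sample positive definite metric
frame = gram_schmidt(g, [[1.0, 0.0], [0.0, 1.0]])  # start from the coordinate frame
print(frame)
```

Applied pointwise to the values of a frame field, the same steps produce an orthonormal frame field, since each step depends smoothly on the input.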



3.16 Curvature tensors in a frame field. 

In terms of a frame field the curvature tensor is given as 

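$$ \sum R^i_{jk\ell}\, E_i \otimes \theta^j \otimes \theta^k \otimes \theta^\ell \quad\text{where}\quad R^i_{jk\ell} = \theta^i(R_{E_k E_\ell} E_j). $$

The Ricci tensor, which as we mentioned, plays a key role in general relativity,
takes the form

$$ \mathrm{Ric} = \sum R_{ij}\, \theta^i \otimes \theta^j \quad\text{where}\quad R_{ij} = \mathrm{Ric}(E_i, E_j) := \sum_m R^m_{imj}. $$

If the frame is orthonormal then for any pair of vector fields V, W we have

$$ \mathrm{Ric}(V, W) = \sum_m \epsilon_m \langle R_{V E_m} E_m, W\rangle. $$

A manifold is called Ricci flat if its Ricci curvature vanishes.
The scalar curvature S is defined as the contraction of the Ricci tensor with
the metric; in an orthonormal frame this reads

$$ S = \sum_i \epsilon_i\, \mathrm{Ric}(E_i, E_i). $$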

3.17 Frame fields and curvature forms. 

Let M be a semi-Riemannian manifold. Let $E_1, \dots, E_n$ be an "orthonormal"
frame field defined on some open subset of M. (In order not to clutter up the
notation we will not introduce a specific name for the domain of definition of
our frame field.) This means the $E_i$ are vector fields and

$$ \langle E_i, E_j\rangle = 0, \quad i \neq j $$

while

$$ \langle E_i, E_i\rangle = \epsilon_i, \quad \epsilon_i = \pm 1. $$

Thus $E_1(p), \dots, E_n(p)$ form an "orthonormal" basis of the tangent space $TM_p$
at each point p in the domain of definition. The dual basis of the cotangent
space then provides a family of linear differential forms, $\theta^1, \dots, \theta^n$. It follows
from the definition that if $v \in TM_p$ then

$$ \langle v, v\rangle = \epsilon_1 (\theta^1(v))^2 + \cdots + \epsilon_n (\theta^n(v))^2. $$

This equation, true at all points in the domain of definition of the frame field, is
usually written as

$$ ds^2 = \epsilon_1 (\theta^1)^2 + \cdots + \epsilon_n (\theta^n)^2. \qquad (3.37) $$

Conversely, if $\theta^1, \dots, \theta^n$ is a collection of linear differential forms satisfying
(3.37) (defined on some open set) then the dual vector fields constitute an
"orthonormal" frame field.

On any manifold, we have the tautological tensor field of type (1,1) which 
assigns to each tangent space the identity linear transformation. We will denote 
this tautological tensor field by id. Thus for any $p \in M$ and any $v \in TM_p$,

$$ \mathrm{id}(v) = v. $$

In terms of a frame field we have

$$ \mathrm{id} = E_1 \otimes \theta^1 + \cdots + E_n \otimes \theta^n $$

in the sense that both sides yield v when applied to any tangent vector v in
the domain of definition of the frame field. We can say that the $\theta^i$ give the
expression for id in terms of the frame field and also introduce the "vector of
differential forms"

$$ \theta := \begin{pmatrix} \theta^1 \\ \vdots \\ \theta^n \end{pmatrix} $$

as a shorthand for the collection of the $\theta^i$.

For each i the Levi-Civita connection yields a tensor field $\nabla E_i$, the covariant
differential of $E_i$ with respect to the connection, and hence linear differential
forms $\omega^i_j$ defined by

$$ \omega^i_j(\xi) = \theta^i(\nabla_\xi E_j). \qquad (3.38) $$

So

$$ \nabla_\xi E_j = \sum_m \omega^m_j(\xi) E_m. $$

The first structure equation of Cartan asserts that

$$ d\theta^i = -\sum_m \omega^i_m \wedge \theta^m. \qquad (3.39) $$

To prove this, we apply the formula (2.3) which says that

$$ d\theta(X, Y) = X(\theta(Y)) - Y(\theta(X)) - \theta([X, Y]) $$

holds for any linear differential form $\theta$ and vector fields X and Y. We apply
this to $\theta^i, E_a, E_b$ to obtain

$$ d\theta^i(E_a, E_b) = E_a \theta^i(E_b) - E_b \theta^i(E_a) - \theta^i([E_a, E_b]). $$

Since $\theta^i(E_b)$ and $\theta^i(E_a)$ = 0 or 1 are constants, the first two terms vanish and
so the left hand side of (3.39) when evaluated on $(E_a, E_b)$ becomes

$$ -\theta^i([E_a, E_b]). $$

As to the right hand side we have

$$ \begin{aligned} -\sum_m (\omega^i_m \wedge \theta^m)(E_a, E_b) &= \Big[ -\sum_m \omega^i_m(E_a)\theta^m + \sum_m \theta^m(E_a)\omega^i_m \Big](E_b) \\ &= -\omega^i_b(E_a) + \omega^i_a(E_b) \\ &= -\theta^i(\nabla_{E_a}(E_b) - \nabla_{E_b}(E_a)) \\ &= -\theta^i([E_a, E_b]). \quad \text{QED} \end{aligned} $$

Notice that

$$ \omega^i_j(\xi) = \theta^i(\nabla_\xi E_j) = \epsilon_i \langle \nabla_\xi E_j, E_i\rangle. $$

Since

$$ 0 = d\langle E_i, E_j\rangle $$

we have

$$ \epsilon_i \omega^i_j = -\epsilon_j \omega^j_i. \qquad (3.40) $$

In particular $\omega^i_i = 0$. If we introduce the "matrix of linear differential forms"

$$ \omega := (\omega^i_j) $$

we can write the first structural equations as

$$ d\theta + \omega \wedge \theta = 0. $$

For tangent vectors $\xi, \eta \in TM_p$ let $\Omega(\xi, \eta) = (\Omega^i_j(\xi, \eta))$ be the matrix of the curvature
operator $R_{\xi\eta}$ with respect to the basis $E_1(p), \dots, E_n(p)$. So

$$ R_{\xi\eta}(E_j)(p) = \sum_i \Omega^i_j(\xi, \eta) E_i(p). $$

Since $R_{\eta\xi} = -R_{\xi\eta}$, $\Omega^i_j(\xi, \eta) = -\Omega^i_j(\eta, \xi)$ so the $\Omega^i_j$ are exterior differential
forms of degree two.

Cartan's second structural equation asserts that

$$ \Omega^i_j = d\omega^i_j + \sum_m \omega^i_m \wedge \omega^m_j. \qquad (3.41) $$

We have

$$ R_{E_a E_b}(E_j) = \sum_i \Omega^i_j(E_a, E_b) E_i $$

by definition. We must show that the right hand side of (3.41) yields the same
result when we substitute $E_a, E_b$ into the differential forms, multiply by $E_i$ and
sum over i.
Write

$$ R_{E_a E_b}(E_j) = \nabla_{E_a}(\nabla_{E_b} E_j) - \nabla_{E_b}(\nabla_{E_a} E_j) - \nabla_{[E_a, E_b]} E_j. $$

Since $\nabla_{E_b}(E_j) = \sum_i \omega^i_j(E_b) E_i$ we get

$$ \nabla_{E_a}(\nabla_{E_b} E_j) = \sum_i E_a(\omega^i_j(E_b)) E_i + \sum_{i,m} \omega^m_j(E_b)\, \omega^i_m(E_a) E_i, $$

$$ \nabla_{[E_a, E_b]} E_j = \sum_i \omega^i_j([E_a, E_b]) E_i $$

so

$$ R_{E_a E_b} E_j = \sum_i \big[ E_a \omega^i_j(E_b) - E_b \omega^i_j(E_a) - \omega^i_j([E_a, E_b]) \big] E_i + \sum_i \Big[ \sum_m \big( \omega^i_m(E_a)\omega^m_j(E_b) - \omega^i_m(E_b)\omega^m_j(E_a) \big) \Big] E_i. $$

The first expression in square brackets is the value on $E_a, E_b$ of $d\omega^i_j$ by (2.3) while
the second expression in square brackets is the value on $E_a, E_b$ of $\sum_m \omega^i_m \wedge \omega^m_j$.
This proves Cartan's second structural equation.
We can write the two structural equations as

$$ d\theta + \omega \wedge \theta = 0 \qquad (3.42) $$

$$ d\omega + \omega \wedge \omega = \Omega \qquad (3.43) $$

3.18 Cartan's lemma. 

We will show that the equations (3.42) and (3.40) determine the $\omega^i_j$. First a
result in exterior algebra:

Lemma 1 Let $x_1, \dots, x_p$ be linearly independent elements of a vector space, V,
and suppose that $y_1, \dots, y_p \in V$ satisfy

$$ x_1 \wedge y_1 + \cdots + x_p \wedge y_p = 0. $$

Then

$$ y_j = \sum_{k=1}^{p} A_{jk} x_k \quad\text{with}\quad A_{jk} = A_{kj}. $$

Proof. Choose $x_{p+1}, \dots, x_n$ if p < n so as to obtain a basis of V and write

$$ y_i = \sum_{j=1}^{p} A_{ij} x_j + \sum_{k=p+1}^{n} B_{ik} x_k. $$

Substituting into the equation of the lemma gives

$$ \sum_{i<j\le p} (A_{ij} - A_{ji})\, x_i \wedge x_j + \sum_{i \le p < k} B_{ik}\, x_i \wedge x_k = 0. $$

Since the $x_i \wedge x_\ell$, $i < \ell$ form a basis of $\wedge^2(V)$, we conclude that $B_{ik} = 0$ and
$A_{ij} = A_{ji}$ which is the content of the lemma.

Suppose that $\omega$ and $\omega'$ are two matrices of one forms which satisfy (3.39).
Then their difference, $\sigma := \omega - \omega'$ satisfies $\sigma \wedge \theta = 0$. Applying the lemma we
conclude that

$$ \sigma^i_k = \sum_j A^i_{jk} \theta^j, \quad A^i_{jk} = A^i_{kj}. $$

If we set

$$ B^i_{jk} = \epsilon_i A^i_{jk} $$

and if both $\omega$ and $\omega'$ satisfy (3.40) so that $\sigma$ does as well, then

$$ B^i_{jk} = B^i_{kj} \quad\text{and}\quad B^i_{jk} = -B^k_{ji}. $$

We claim that these two equations imply that all the $B^i_{jk} = 0$ and hence that
$\sigma = 0$. Indeed,

$$ B^i_{jk} = -B^k_{ji} = -B^k_{ij} = B^j_{ik} = B^j_{ki} = -B^i_{kj} = -B^i_{jk}. $$

The upshot is that if we have found $\omega$ satisfying (3.42) and (3.40) then we know
that it is the matrix of connection forms.



3.19 Orthogonal coordinates on a surface. 

If n = 2 there is only one independent linear differential form in $\omega$ namely

$$ \omega^1_2 = -\epsilon_1 \epsilon_2\, \omega^2_1. $$

Suppose that (u, v) are orthogonal coordinates on the surface which means that

$$ \langle \partial_u, \partial_v\rangle = 0. $$

Set $e := \sqrt{|E|}$ and $g := \sqrt{|G|}$ where

$$ E := \langle \partial_u, \partial_u\rangle = \epsilon_1 e^2, \quad G := \langle \partial_v, \partial_v\rangle = \epsilon_2 g^2. $$

The frame field

$$ E_1 := \frac{1}{e}\partial_u, \quad E_2 := \frac{1}{g}\partial_v $$

is "orthonormal" with dual frame given by

$$ \theta^1 = e\,du, \quad \theta^2 = g\,dv. $$

Taking exterior derivatives yields

$$ d\theta^1 = e_v\, dv \wedge du = -(e_v/g)\, du \wedge \theta^2 $$

$$ d\theta^2 = g_u\, du \wedge dv = -(g_u/e)\, dv \wedge \theta^1. $$

Hence

$$ \omega^1_2 = (e_v/g)\,du - \epsilon_1\epsilon_2 (g_u/e)\,dv $$

by the uniqueness of the solution to the Cartan equations. In two dimensions
the second structural equation reduces to

$$ \Omega^1_2 = d\omega^1_2 $$

and we compute

$$ d\omega^1_2 = -\big[(e_v/g)_v + \epsilon_1\epsilon_2 (g_u/e)_u\big]\, du \wedge dv = -\frac{1}{eg}\big[(e_v/g)_v + \epsilon_1\epsilon_2 (g_u/e)_u\big]\, \theta^1 \wedge \theta^2. $$
The sectional curvature (=the Gaussian curvature) is then given by

$$ K = \epsilon_2\, \Omega^1_2(E_1, E_2) = -\frac{\epsilon_2}{eg}\big[(e_v/g)_v + \epsilon_1\epsilon_2 (g_u/e)_u\big]. \qquad (3.44) $$

We obtained this formula in the positive definite case by much more complicated 
means in the first chapter. 
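
As a sanity check of (3.44) in the positive definite case ($\epsilon_1 = \epsilon_2 = 1$), the formula $K = -\frac{1}{eg}[(e_v/g)_v + (g_u/e)_u]$ can be evaluated numerically with central differences. The sketch below is only an illustration: the function name `gaussian_curvature` and the step size are arbitrary choices, and the two test metrics (a sphere of radius R in colatitude and longitude, and the flat plane in polar coordinates) are examples not taken from the text.

```python
import math

def gaussian_curvature(e, g, u, v, step=1e-4):
    """Approximate K = -(1/(e g)) [ (e_v/g)_v + (g_u/e)_u ] at the point (u, v).

    e and g are functions of (u, v) for the metric e^2 du^2 + g^2 dv^2
    (positive definite case only). Derivatives are central differences.
    """
    def e_v_over_g(a, b):
        e_v = (e(a, b + step) - e(a, b - step)) / (2 * step)
        return e_v / g(a, b)

    def g_u_over_e(a, b):
        g_u = (g(a + step, b) - g(a - step, b)) / (2 * step)
        return g_u / e(a, b)

    term_v = (e_v_over_g(u, v + step) - e_v_over_g(u, v - step)) / (2 * step)
    term_u = (g_u_over_e(u + step, v) - g_u_over_e(u - step, v)) / (2 * step)
    return -(term_v + term_u) / (e(u, v) * g(u, v))

# Sphere of radius R: u = colatitude, v = longitude, so e = R and g = R sin u.
R = 2.0
k_sphere = gaussian_curvature(lambda u, v: R, lambda u, v: R * math.sin(u), 1.0, 0.3)
# Flat plane in polar coordinates: u = radius, v = angle, so e = 1 and g = u.
k_plane = gaussian_curvature(lambda u, v: 1.0, lambda u, v: u, 1.5, 0.3)
print(k_sphere, k_plane)
```

The sphere should give a value close to $1/R^2$ and the plane a value close to zero, up to discretization error.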

Exercises. 1. 

3.20 The curvature of the Schwarzschild metric

We use polar coordinates on space and t for time so coordinates $t, r, \vartheta, \phi$ and
introduce the shorthand notation

$$ S := \sin\vartheta, \quad C := \cos\vartheta. $$

We fix a positive real number M and assume that

$$ r > 2M. $$

The Schwarzschild metric is given as

$$ ds^2 = -(\theta^0)^2 + (\theta^1)^2 + (\theta^2)^2 + (\theta^3)^2 \quad\text{where} $$

$$ \theta^0 = \sqrt{h}\,dt, \quad h := 1 - \frac{2M}{r} $$

$$ \theta^1 = \frac{1}{\sqrt{h}}\,dr $$

$$ \theta^2 = r\,d\vartheta $$

$$ \theta^3 = rS\,d\phi $$

1. Compute $d\sqrt{h}$ and then each of the $d\theta^i$, i = 0, 1, 2, 3.

2. Find the connection form matrix $\omega$.

3. Find the curvature form matrix $\Omega = d\omega + \omega \wedge \omega$.

4. Show that the Schwarzschild metric is Ricci flat.

5. Find the sectional curvatures of the "coordinate planes", i.e. the planes
spanned by any two of $\partial_t, \partial_r, \partial_\vartheta, \partial_\phi$.

6. The space of the Schwarzschild metric is the "twisted product" of the
"Schwarzschild plane" B spanned by r, t with the metric given by $-(\theta^0)^2 + (\theta^1)^2$
and the unit sphere S in the sense that the metric has the form

$$ g = g_B + r^2 g_S. $$

From this fact alone (i.e. using none of the preceding computations) together
with Koszul's formula show that

$$ \langle \nabla_X Y, Z\rangle = 0 $$

if X and Y are vector fields on B and Z is a vector field on S (all thought of as 
vector fields on the full space). 

Exercises 2. 



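3.21 Geodesics of the Schwarzschild metric.

The purpose of this problem set is to go through the details of two of the famous
results of general relativity, the explanation of the advance of the perihelion of
Mercury and the deflection of light passing near the sun (Einstein, 1915). In order
to get results in useful form, we shall explicitly include Newton's gravitational
constant G.

The equations for geodesics in a local coordinate system on a semi-Riemannian
manifold are

$$ \frac{d^2x^k}{ds^2} + \sum_{i,j} \Gamma^k_{ij}\, \frac{dx^i}{ds}\frac{dx^j}{ds} = 0 $$

where

$$ \Gamma^k_{ij} = \frac12 \sum_m g^{km}\left( \frac{\partial g_{im}}{\partial x^j} + \frac{\partial g_{jm}}{\partial x^i} - \frac{\partial g_{ij}}{\partial x^m} \right). $$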



One of the postulates of general relativity is that a "small" particle will
move along a geodesic in a four dimensional Lorentzian manifold whose met-
ric is determined by the matter distribution over the manifold. Here the word
"small" is taken to mean that the effect of the mass of the particle on the metric
itself can be ignored. We can ignore the mass of a planet when the metric is 
determined by the mass distribution of the stars. This notion of "small" or 
"passive" is similar to that involved in the equations of motion of a charged 
particle in an electromagnetic field. General electromagnetic theory says that 
the particle itself affects the electromagnetic field, but for "small" particles we 
ignore this and treat the particles as passively responding to the field. Similarly 
here. We will have a lot to say about the philosophical underpinnings of the 
postulate "small particles move along geodesics" when we have enough mathe-
matical machinery. The theory also specifies that if the particle is massive then
the geodesic is timelike, while if the particle has mass zero then the geodesic is
a null geodesic, i.e. lightlike.

The second component of the theory is how the distribution of matter de- 
termines the metric. This is given by the Einstein field equations: Matter 
distribution is described by a (possibly degenerate) symmetric bilinear form on 
the tangent space at each point called the stress energy tensor, T. The Einstein 
equations take the form $Q = 8\pi T$ where Q is related to the Ricci curvature. In
particular, in empty space, the Einstein equations become Q = 0. 

Although the study of these equations is a huge enterprise, the solution for 
the equations Q = 0 in the exterior of a star of mass M which is "spherically
symmetric" , "stationary" and tends to the Minkowski metric at large distances 
was found almost immediately by Schwarzschild. (The words in quotes need to 
be more carefully defined.) This is the metric 

$$ ds^2 := -h\,dt^2 + h^{-1}dr^2 + r^2 d\sigma^2 \qquad (3.47) $$

where

$$ h = h(r) := 1 - \frac{2GM}{r} \qquad (3.48) $$

where G is Newton's gravitational constant and $d\sigma^2$ is the invariant metric on
the ordinary unit sphere,

$$ d\sigma^2 = d\theta^2 + \sin^2\theta\, d\phi^2. \qquad (3.49) $$

To be more precise, let $P_I \subset \mathbf{R}^2$ consist of those pairs, (t, r) with

$$ r > 2GM. $$

Let

$$ N = P_I \times S^2, $$

the set of all (t, r, q), r > 2GM, $q \in S^2$. The coordinates $(\theta, \phi)$ can be used on
the sphere with the north and south pole removed, and (3.49) is the local ex-
pression for the invariant metric of the unit sphere in terms of these coordinates.
Then the metric we are considering on N is given by (3.47) as above.

Notice that the structure of N is like that of a surface of revolution, with
the interval on the z-axis replaced by the two dimensional region, $P_I$, the circle
replaced by the sphere, and the radius of revolution, f, replaced by $r^2$.

If we set $x^0 := t$, $x^1 := r$, $x^2 := \theta$, $x^3 := \phi$ then

$$ g_{ij} = 0, \quad i \neq j \qquad (3.50) $$

while

$$ g_{00} = -h, \quad g_{11} = h^{-1}, \quad g_{22} = r^2, \quad g_{33} = r^2 \sin^2\theta. $$

Recall that a system of coordinates in which a metric satisfies (3.50) is called
an orthogonal coordinate system. In such a coordinate system we have seen that
the geodesic equations are

$$ \frac{d}{ds}\left( g_{kk} \frac{dx^k}{ds} \right) = \frac12 \sum_j \frac{\partial g_{jj}}{\partial x^k} \left( \frac{dx^j}{ds} \right)^2. \qquad (3.51) $$



1. Show that for the Schwarzschild metric, (3.47), the equation involving $g_{22}$
on the left is

$$ \frac{d}{ds}\left( r^2 \frac{d\theta}{ds} \right) = r^2 \sin\theta \cos\theta \left( \frac{d\phi}{ds} \right)^2. $$

Conclude from the uniqueness theorem for solutions of differential equations that
if $\theta(0) = \pi/2$, $\dot\theta(0) = 0$ then $\theta(s) = \pi/2$ along the whole geodesic. Conclude
from rotational invariance that all geodesics must lie in a plane, i.e. by suitable
choice of poles of the sphere we can arrange that $\theta = \pi/2$.

2. With the above choice of spherical coordinates along the geodesic, show that
the $g_{00}$ and $g_{33}$ equations become

$$ h \frac{dt}{ds} = E $$

$$ r^2 \frac{d\phi}{ds} = L $$

where E and L are constants. These constants are called the "energy" and the
"angular momentum". Notice that for L > 0, as we shall assume, $d\phi/ds > 0$,
so we can use $\phi$ as a parameter on the orbit if we like.

General principles of mechanics imply that there is a "constant of motion"
associated to every one parameter group of symmetries of the system. The
Schwarzschild metric is invariant under time translations $t \mapsto t + c$ and under
rotations $\phi \mapsto \phi + \alpha$. Under the general principles mentioned above, it turns out
that E corresponds to time translation and that L corresponds to $\phi \mapsto \phi + \alpha$.

We now consider separately the case of a massive particle where we can
choose the parameter s so that $\langle \gamma'(s), \gamma'(s)\rangle = -1$ and massless particles for
which $\langle \gamma'(s), \gamma'(s)\rangle \equiv 0$.






3.21.1 Massive particles. 

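We can write the tangent vector, $\gamma'(s)$ to the geodesic $\gamma$ at the point s as

$$ \gamma'(s) = \dot x^0(s)\left(\frac{\partial}{\partial x^0}\right)_{\gamma(s)} + \dot x^1(s)\left(\frac{\partial}{\partial x^1}\right)_{\gamma(s)} + \dot x^2(s)\left(\frac{\partial}{\partial x^2}\right)_{\gamma(s)} + \dot x^3(s)\left(\frac{\partial}{\partial x^3}\right)_{\gamma(s)}. $$

Let us assume that we use proper time as the parameterization of our geodesic
so that

$$ \langle \gamma'(s), \gamma'(s)\rangle_{\gamma(s)} = -1. $$

3. Using this last equation and the results of problem 2, show that

$$ E^2 = \left(\frac{dr}{ds}\right)^2 + \left(1 + \frac{L^2}{r^2}\right) h(r) \qquad (3.52) $$

along any geodesic.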



Orbit Types. 

We can write (3.52) as

$$ E^2 = \left(\frac{dr}{ds}\right)^2 + V(r) \qquad (3.53) $$

where the effective potential V is given as

$$ V(r) := 1 - \frac{2GM}{r} + \frac{L^2}{r^2} - \frac{2GML^2}{r^3}. $$

The behavior of the orbit depends on the relative size of L and GM. In
particular, (3.53) implies that on any orbit, r is restricted to an interval

$$ I \subset \{ r : V(r) \le E^2 \} \text{ such that } r(0) \in I. $$

If we differentiate (3.53) we get

$$ 2\left(\frac{d^2r}{ds^2}\right)\left(\frac{dr}{ds}\right) = -V'(r)\left(\frac{dr}{ds}\right). \qquad (3.54) $$

In particular, a critical point of V, i.e. a point $r_0$ for which $V'(r_0) = 0$, gives rise
to a circular orbit $r = r_0$. If R is a non-critical point of V for which $V(R) = E^2$,
then R is a turning point - the orbit reaches the end point R of the interval I
and then turns around to move along I in the opposite direction.

Observe that V(2GM) = 0 and $V(r) \to 1$ as $r \to \infty$. To determine how V
goes from 0 to 1 on $[2GM, \infty)$ we compute

$$ V'(r) = \frac{2}{r^4}\left( GMr^2 - L^2 r + 3GML^2 \right) \qquad (3.55) $$






and the quadratic polynomial in r given by the expression in parentheses has
discriminant

$$ L^2 (L^2 - 12G^2M^2). $$

If this discriminant is negative, there are no critical points, so V increases
monotonically from 0 to 1 on $[2GM, \infty)$. If this discriminant is positive there are two critical
points, $r_1 < r_2$. Since V'(2GM) > 0, we see that $r_1$ is a local maximum and $r_2$ a
local minimum. (We will ignore the exceptional case of discriminant zero.) In
the positive discriminant case we must distinguish between the cases where the
local maximum at $r_1$ is not a global maximum, and when it is. Since $V(r) \to 1$
as $r \to \infty$ these two cases are distinguished by $V(r_1) < 1$ and $V(r_1) > 1$.
Ignoring non-generic cases we thus can classify the behavior of r(s) as:

• $L^2 < 12G^2M^2$ so V has no critical points and hence is monotone increasing
on the interval $[2GM, \infty)$. The behavior of r(s) for s > 0 subdivides
into four cases, all leading to "crashing" (i.e. reaching the Schwarzschild
boundary 2GM in finite s) or escape to infinity. The four possibilities
have to do with the sign of $\dot r(0)$ and whether $E^2 < 1$ or $E^2 > 1$.

1. $E^2 < 1$, $\dot r(0) < 0$. Since V decreases as r decreases, (3.53) implies
that $\dot r(s) < 0$ for all s > 0 where it is defined. The particle crashes
into the barrier at 2GM in finite time.

2. $E^2 < 1$, $\dot r(0) > 0$. The orbit initially moves in the direction of increasing
r, reaches its maximum value where $V(r) = E^2$, turns around and
crashes.

3. $E^2 > 1$, $\dot r(0) > 0$. The particle escapes to infinity.

4. $E^2 > 1$, $\dot r(0) < 0$. The particle crashes.

• $12G^2M^2 < L^2 < 16G^2M^2$. Here there are two critical points, but the
maximum value at $r_1$ is < 1. There are now four types of intervals I,
depending on the value of E:

1. $E^2 < V(r_1)$, $r(0) < r_1$. Here the interval I lies to the left of the
local maximum. The behavior will be like the first two cases above -
"crash" if $\dot r(0) < 0$ and turn around then crash if $\dot r(0) > 0$.

2. $E^2 < V(r_1)$, $r(0) > r_1$. The interval I now lies in a well to the right
of $r_1$, and so the value of r has two turning points corresponding
to the end points of this interval. In other words the value of r
is bounded along the entire orbit. We call this a bound orbit.
In the "non-relativistic" approximation, this corresponds to Kepler's
ellipses. In problems 4 and 5 below we will examine more closely how
this approximation works and derive Einstein's famous calculation of
the advance of the perihelion of Mercury.

3. $V(r_1) < E^2 < 1$. The interval I is bounded on the right by the curve
and extends all the way to the left (up to the barrier at 2GM). The
behavior is again either direct crash or turn around and then crash
according to the sign of $\dot r(0)$.

4. $E^2 > 1$. Now the possible behaviors are "crash" if $\dot r(0) < 0$ or escape
to infinity if $\dot r(0) > 0$.

• $L > 4GM$. Now $V(r_1) > 1$. Again there will be four possible intervals:

1. $E^2 < V(r_1)$, $r(0) < r_1$. This is an interval lying to the left of the
"potential barrier" and so yields either a crash or turn around then
crash orbit.

2. $1 < E^2 < V(r_1)$. Now I lies to the right of the barrier, but below
its peak, extending out to $\infty$ on the right. The orbit will escape to
infinity if $\dot r(0) > 0$ or turn around and then escape if $\dot r(0) < 0$.

3. $E^2 > V(r_1)$. The interval I extends from 2GM to infinity and the orbit
is either crash or escape depending on the sign of $\dot r(0)$.

4. $V(r_2) < E^2 < 1$. The interval now lies in a "well" to the right of the
peak at $r_1$. We have again a bound orbit.
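
The case analysis above can be sketched as a small classifier. This is only an illustration under stated assumptions: the effective potential is taken to be $V(r) = (1 + L^2/r^2)(1 - 2GM/r)$, the critical points are the roots of $GMr^2 - L^2 r + 3GML^2$, non-generic cases (a start exactly at a critical point, or $\dot r(0) = 0$) are not treated carefully, and the names `V`, `critical_points` and `classify_orbit` as well as the default `GM=1.0` are hypothetical.

```python
import math

def V(r, L, GM=1.0):
    """Effective potential (1 + L^2/r^2)(1 - 2GM/r)."""
    return (1.0 + L**2 / r**2) * (1.0 - 2.0 * GM / r)

def critical_points(L, GM=1.0):
    """Return (r1, r2), the roots of GM r^2 - L^2 r + 3 GM L^2, or None
    when the discriminant L^2 (L^2 - 12 G^2 M^2) is not positive."""
    disc = L**2 * (L**2 - 12.0 * GM**2)
    if disc <= 0.0:
        return None
    root = math.sqrt(disc)
    return (L**2 - root) / (2.0 * GM), (L**2 + root) / (2.0 * GM)

def classify_orbit(E, L, r0, rdot0, GM=1.0):
    """Classify a generic orbit starting at radius r0 with radial velocity rdot0."""
    if r0 <= 2.0 * GM or V(r0, L, GM) > E**2:
        raise ValueError("initial point is not in the allowed region")
    crit = critical_points(L, GM)
    right_of_peak = crit is not None and r0 > crit[0]
    peak = V(crit[0], L, GM) if crit is not None else None
    # The peak at r1 acts as a barrier on the inside and prevents a crash.
    inner_barrier = right_of_peak and E**2 < peak
    # A barrier on the outside prevents escape to infinity.
    if crit is not None and not right_of_peak:
        outer_barrier = E**2 < max(peak, 1.0)
    else:
        outer_barrier = E**2 < 1.0
    if inner_barrier and outer_barrier:
        return "bound"
    if inner_barrier:
        return "escape" if rdot0 > 0 else "turn around then escape"
    if outer_barrier:
        return "crash" if rdot0 < 0 else "turn around then crash"
    return "crash" if rdot0 < 0 else "escape"

print(classify_orbit(math.sqrt(0.96), 4.5, 16.0, 1.0))
```

Rather than enumerating the twelve cases one by one, the sketch asks two questions, whether the peak at $r_1$ blocks inward motion and whether anything blocks outward motion, which reproduces the same classification.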

We are interested in the bound orbits with L > 0. According to problem 2
we can use $\phi$ as a parameter on such an orbit and by the second equation in
that problem we have

$$ \dot r := \frac{dr}{ds} = \frac{dr}{d\phi}\frac{d\phi}{ds} = \frac{L}{r^2}\frac{dr}{d\phi}. $$

Substituting this and the definition of h into (3.52) we get

$$ E^2 = \frac{L^2}{r^4}\left(\frac{dr}{d\phi}\right)^2 + \left(1 + \frac{L^2}{r^2}\right)\left(1 - \frac{2GM}{r}\right). $$

It is now convenient to introduce the variable

$$ u := \frac{1}{r} $$

instead of r. We have

$$ \frac{du}{d\phi} = -\frac{1}{r^2}\frac{dr}{d\phi} $$

so

$$ E^2 = L^2\left(\frac{du}{d\phi}\right)^2 + (1 + L^2u^2)(1 - 2GMu). \qquad (3.56) $$

We can rewrite this as

$$ \left(\frac{du}{d\phi}\right)^2 = 2GM\,Q, \quad Q := u^3 - \frac{1}{2GM}u^2 + \beta_1 u + \beta_0 \qquad (3.57) $$

where $\beta_0$ and $\beta_1$ are constants, combinations of E, L, and GM:

$$ \beta_1 = \frac{1}{L^2}, \quad \beta_0 = \frac{E^2 - 1}{2GML^2}. $$

Perihelion advance.

We will be interested in the case of bound orbits. In this case, a maximum
value, $u_1$ along the orbit must be a root of the cubic polynomial, Q, as must
be a minimum, $u_2$, since these are turning points where the left hand side of
(3.57) vanishes. Notice that these values do not depend on $\phi$, being roots of
a given polynomial with constant coefficients. Since two of the roots of Q are
real, so is the third, and all three roots must add up to $\frac{1}{2GM}$, the negative of
the coefficient of $u^2$. Thus the third root is

$$ \frac{1}{2GM} - u_1 - u_2. $$

We thus have

$$ \left(\frac{du}{d\phi}\right)^2 = 2GM(u - u_1)(u - u_2)\left(u - \frac{1}{2GM} + u_1 + u_2\right). $$

Since the first factor on the right is non-positive and the second non-negative,
the third is non-positive as the product must equal the non-negative expression
on the left. Furthermore, we will be interested in the region where $r \gg 2GM$
so

$$ 2GM(u + u_1 + u_2) < 6GMu_1 < 1. $$

We therefore have the following expressions for $|d\phi/du|$:

$$ \left|\frac{d\phi}{du}\right| = \frac{[1 - 2GM(u + u_1 + u_2)]^{-1/2}}{\sqrt{(u_1 - u)(u - u_2)}} \qquad (3.58) $$

$$ \approx \frac{1 + GM(u + u_1 + u_2)}{\sqrt{(u_1 - u)(u - u_2)}} \qquad (3.59) $$

$$ \approx \frac{1}{\sqrt{(u_1 - u)(u - u_2)}}. \qquad (3.60) $$

Here (3.59) is obtained from (3.58) by ignoring terms which are quadratic or
higher in $2GM(u + u_1 + u_2)$ and (3.60) is obtained from (3.58) by ignoring terms
which are linear or higher in $2GM(u + u_1 + u_2)$.

The strategy now is to observe that (3.60) is really the equation of an el-
lipse, whose Apollonian parameters, the latus rectum and the eccentricity, are
expressed in terms of $u_1$ and $u_2$. Then (3.59) is used to approximate the advance
in the perihelion of Keplerian motion associated to this ellipse.

4. Show that the ellipse

$$ u = \frac{1}{\ell}(1 + e\cos\phi) $$

is a solution of (3.60) where e and $\ell$ are determined from

$$ u_1 = \frac{1}{\ell}(1 + e), \quad u_2 = \frac{1}{\ell}(1 - e) $$

so that the mean distance

$$ a := \frac12\left(\frac{1}{u_1} + \frac{1}{u_2}\right) = \frac{\ell}{1 - e^2}. $$

This is the approximating ellipse with the same maximum and minimum dis-
tance to the sun as the true orbit, if we choose our angular coordinate $\phi$ so that
the x-axis is aligned with the axis of the ellipse.

In principle (3.58) is in solved form; if we integrate the right hand side from
$u_1$ to $u_2$ and then back again, we will get the total change in $\phi$ across a complete
cycle. Instead, we will approximate this integral by replacing (3.58) by (3.59)
and then also make the approximate change of variables $u = \ell^{-1}(1 + e\cos\phi)$.

5. By making these approximations and substitutions show that the integral
becomes

$$ \int_0^{2\pi} \left[ 1 + GM\ell^{-1}(3 + e\cos\phi) \right] d\phi = 2\pi + \frac{6\pi}{\ell}GM $$

so the perihelion advance in one revolution is

$$ \frac{6\pi GM}{a(1 - e^2)}. $$
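
The quality of the first order approximation can be checked numerically. With the substitution $u = \ell^{-1}(1 + e\cos\chi)$ for an auxiliary angle $\chi$, the integral of the right hand side of (3.58) over a full cycle becomes $\int_0^{2\pi}[1 - 2GM\ell^{-1}(3 + e\cos\chi)]^{-1/2}\,d\chi$. The sketch below, whose function names and sample values of $GM/\ell$ and e are arbitrary illustrations, evaluates this integral by the midpoint rule and compares the excess over $2\pi$ with $6\pi GM/\ell$.

```python
import math

def full_cycle_angle(gm_over_l, e, n=2000):
    """Midpoint-rule value of the integral over [0, 2 pi] of
    [1 - 2 (GM/l) (3 + e cos x)]^(-1/2)."""
    h = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        total += (1.0 - 2.0 * gm_over_l * (3.0 + e * math.cos(x))) ** -0.5
    return total * h

def first_order_advance(gm_over_l):
    """Perihelion advance per revolution to first order, 6 pi GM / l."""
    return 6.0 * math.pi * gm_over_l

advance = full_cycle_angle(1e-4, 0.2) - 2.0 * math.pi
print(advance, first_order_advance(1e-4))
```

For small $GM/\ell$ the two numbers agree up to a relative error of order $GM/\ell$, which is the size of the quadratic terms dropped in passing from (3.58) to (3.59).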

We have done these computations in units where the speed of light is one.
If we are given the various constants in conventional units, say

$$ G = 6.67 \times 10^{-11}\ \mathrm{m^3/(kg\ sec^2)}, $$

and the mass of the sun in kilograms

$$ M = 1.99 \times 10^{30}\ \mathrm{kg} $$

we must replace G by $G/c^2$ where c is the speed of light, $c = 3 \times 10^8$ m/sec.
Then $2GM/c^2$ is about 3 km. We may divide by the period of the planet to get the
rate of advance as

$$ \frac{6\pi GM}{c^2 a (1 - e^2) T}. $$

If we substitute, for Mercury, the mean distance $a = 5.768 \times 10^{10}$ m, eccentricity
e = 0.206 and period T = 88 days, and use the conversions

century = 36524 days
radian = [360/2π] degrees
degree = 3600"

we get the famous value of 43.1"/century for the advance of the perihelion of
Mercury. This advance had been observed in the middle of the nineteenth century.

Up until recently, this observational verification of general relativity was not
conclusive. The reason is that Newton's theory is based on the assumption that
the mass of the sun is concentrated at a point. A famous theorem of Newton
says that the attraction due to a homogeneous ball (on a particle outside) is
the same as if all the mass is concentrated at a point. But if the sun is not a
perfect sphere, or if its mass is not uniformly distributed, one would expect some
deviation from Kepler's laws. The small effect of the advance of the perihelion
of Mercury might have an explanation in terms of Newtonian mechanics. In
recent years, measurements from pulsars indicate large perihelion advances
of the order of degrees per year (instead of arc seconds per century) yielding a
striking confirmation of Einstein's theory.
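
The Mercury figure quoted above can be reproduced by direct substitution into $6\pi GM/(c^2 a(1-e^2)T)$. The sketch below uses only the constants and conversions given in the text; the function name is illustrative.

```python
import math

G = 6.67e-11      # m^3 / (kg s^2), value quoted in the text
M_SUN = 1.99e30   # kg, value quoted in the text
C = 3.0e8         # m / s, value quoted in the text

def perihelion_advance_arcsec_per_century(a, e, period_days):
    """Rate of perihelion advance from 6 pi G M / (c^2 a (1 - e^2)) per orbit."""
    per_orbit = 6.0 * math.pi * G * M_SUN / (C**2 * a * (1.0 - e**2))  # radians
    orbits_per_century = 36524.0 / period_days
    arcsec_per_radian = (360.0 / (2.0 * math.pi)) * 3600.0
    return per_orbit * orbits_per_century * arcsec_per_radian

mercury = perihelion_advance_arcsec_per_century(a=5.768e10, e=0.206, period_days=88.0)
print(round(mercury, 1))
```

With these rounded inputs the result comes out close to the 43.1"/century quoted in the text.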

3.21.2 Massless particles. 

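We now have

$$ \gamma'(s) = \dot x^0(s)\left(\frac{\partial}{\partial x^0}\right)_{\gamma(s)} + \dot x^1(s)\left(\frac{\partial}{\partial x^1}\right)_{\gamma(s)} + \dot x^2(s)\left(\frac{\partial}{\partial x^2}\right)_{\gamma(s)} + \dot x^3(s)\left(\frac{\partial}{\partial x^3}\right)_{\gamma(s)} $$

and

$$ \langle \gamma'(s), \gamma'(s)\rangle_{\gamma(s)} = 0. $$

6. Using problem 2 verify that

$$ E^2 = \left(\frac{dr}{ds}\right)^2 + \frac{L^2}{r^2} h(r) $$

and then

$$ \frac{d^2u}{d\phi^2} + u = 3GMu^2. \qquad (3.61) $$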



We will be interested in orbits which go out to infinity in both directions. For
large values of r, the right hand side is negligibly small, so we should compare
(3.61) with

$$ \frac{d^2u}{d\phi^2} + u = 0 $$

whose solutions are

$$ u = a\cos\phi + b\sin\phi $$

or

$$ 1 = ax + by, \quad x = r\cos\phi, \quad y = r\sin\phi $$

in other words straight lines. We might as well choose our angular coordinate
$\phi$ so that this straight line is parallel to the y-axis, i.e.

$$ u_0 = r_0^{-1}\cos\phi $$

where $r_0$ is the distance of closest approach to the origin. Suppose we are
interested in light rays passing the sun. The radius of the sun is about $7 \times 10^5$
km while 2GM is about 3 km. Hence in units where r is of order 1 the
expression 3GM is a very small quantity, call it $\epsilon$. So write our equation as

$$ u'' + u = \epsilon u^2, \quad \epsilon = 3GM. \qquad (3.62) $$

We solve this by the method of perturbation theory: look for a solution of the
form

$$ u = u_0 + \epsilon v + \cdots $$

where the error is of order $\epsilon^2$. We choose $u_0$ as above to solve the equation
obtained by equating the zero-th order terms in $\epsilon$.

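7. Compare coefficients of $\epsilon$ to obtain the equation

$$ v'' + v = \frac{1}{2r_0^2}(1 + \cos 2\phi) $$

and try a solution of the form $v = a + b\cos 2\phi$ to find the solution of this equation
and so obtain the first order approximation

$$ u = \frac{1}{r_0}\cos\phi - \frac{\epsilon}{6r_0^2}\cos 2\phi + \frac{\epsilon}{2r_0^2} = \frac{1}{r_0}\cos\phi - \frac{\epsilon}{3r_0^2}\cos^2\phi + \frac{2\epsilon}{3r_0^2} \qquad (3.63) $$

to (3.62).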



The asymptotes as $r \to \infty$ or $u \to 0$ will be straight lines with angles
obtained by setting u = 0 in (3.63). This gives a quadratic equation for $\cos\phi$.

8. Remembering that cosine must be at most 1 in absolute value, show that up
through order $\epsilon$ we have

$$ \cos\phi = -\frac{2\epsilon}{3r_0} = -\frac{2GM}{r_0}. $$

Writing $\phi = \pi/2 + \delta$ this gives $\sin\delta = 2GM/r_0$ or approximately $\delta =
2GM/r_0$. This was for one asymptote. The same calculation gives the same
result for the other asymptote. Adding the two and passing to conventional
units gives

$$ \Delta = \frac{4GM}{c^2 r_0} \qquad (3.64) $$

for the deflection. For light just grazing the sun this predicts a deflection of
1.75". This was approximately observed in the expedition to the solar eclipse
of 1919.
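
Both steps of this computation can be checked numerically. The sketch below assumes the first order solution $u = r_0^{-1}\cos\phi - \frac{\epsilon}{3r_0^2}\cos^2\phi + \frac{2\epsilon}{3r_0^2}$ with $\epsilon = 3GM$, so that u = 0 becomes the quadratic $GM c^2 - r_0 c - 2GM = 0$ in $c = \cos\phi$. The constants are the rounded values used earlier in this section, the solar radius $7 \times 10^8$ m is the rough figure quoted above, and the function names are illustrative.

```python
import math

G = 6.67e-11      # m^3 / (kg s^2)
M_SUN = 1.99e30   # kg
C = 3.0e8         # m / s
ARCSEC_PER_RADIAN = (360.0 / (2.0 * math.pi)) * 3600.0

def asymptote_cosine(gm, r0):
    """Root c = cos(phi) of gm*c^2 - r0*c - 2*gm = 0 with |c| <= 1.

    gm is GM expressed as a length. The small root (r0 - s)/(2 gm), with
    s = sqrt(r0^2 + 8 gm^2), is rewritten to avoid cancellation.
    """
    return -4.0 * gm / (r0 + math.sqrt(r0**2 + 8.0 * gm**2))

def deflection_arcsec(r0_m):
    """Total deflection 4GM/(c^2 r0) of (3.64), in arc seconds."""
    return 4.0 * G * M_SUN / (C**2 * r0_m) * ARCSEC_PER_RADIAN

gm_sun = G * M_SUN / C**2                   # GM/c^2 in meters
exact_cos = asymptote_cosine(gm_sun, 7.0e8)
first_order_cos = -2.0 * gm_sun / 7.0e8
print(exact_cos, first_order_cos, deflection_arcsec(7.0e8))
```

The exact root of the quadratic and the first order value $-2GM/r_0$ are indistinguishable at this scale, and the deflection comes out near the quoted 1.75", the small difference reflecting the rounded solar radius.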

Recent, remarkable, photographs from the Hubble space telescope have given 
strong confirmation to Einstein's theory from the deflection of light by dark 
matter. 



Chapter 4 

The bundle of frames. 



4.1 Connection and curvature forms in a frame 
field. 

Let $E = (E_1, \dots, E_n)$ be a(n orthonormal) frame field and

$$ \theta = \begin{pmatrix} \theta^1 \\ \vdots \\ \theta^n \end{pmatrix} $$

the dual frame field so

$$ \mathrm{id} = E_1 \otimes \theta^1 + \cdots + E_n \otimes \theta^n $$

or

$$ \mathrm{id} = (E_1, \dots, E_n) \begin{pmatrix} \theta^1 \\ \vdots \\ \theta^n \end{pmatrix} $$

where id is the tautological tensor field which assigns the identity map to each
tangent space. We write this more succinctly as

$$ \mathrm{id} = E\theta. $$

The (matrix of) connection form(s) in terms of the frame field is then determined
by

$$ d\theta + \omega \wedge \theta = 0 $$

and the curvature by

$$ d\omega + \omega \wedge \omega = \Omega. $$

We now repeat an argument that we gave when discussing the general Maurer
Cartan form: Recall that for any two form $\tau$ and a pair of vector fields X and
Y we write $\tau(X, Y) = i(Y)i(X)\tau$. Thus

$$ (\omega \wedge \omega)(X, Y) = \omega(X)\omega(Y) - \omega(Y)\omega(X), $$

the commutator of the two matrix valued functions, $\omega(X)$ and $\omega(Y)$. Consider
the commutator of two matrix valued one forms, $\omega$ and $\sigma$,

$$ \omega \wedge \sigma + \sigma \wedge \omega $$

(according to our usual rules of superalgebra). We denote this by

$$ [\omega\wedge, \sigma]. $$

In particular we may take $\omega = \sigma$ to obtain

$$ [\omega\wedge, \omega] = 2\omega \wedge \omega. $$

We can thus also write the curvature as

$$ \Omega = d\omega + \frac12[\omega\wedge, \omega]. $$

This way of writing the curvature has useful generalizations when we want to 
study connections on principal bundles later on in this chapter. 

4.2 Change of frame field. 

Suppose that E' is a second frame field whose domain of definition overlaps with
the domain of definition of E. On the intersection of their domains of definition
we must have

$$ E' = EC $$

where C is a(n orthogonal) matrix valued function. Let
$\theta'$ be the dual frame field of E'. On the common domain of definition we have

$$ EC\theta' = E'\theta' = \mathrm{id} = E\theta $$

so

$$ \theta = C\theta'. $$

Let $\omega'$ be the connection form associated to $\theta'$, so $\omega'$ is determined (using
Cartan's lemma) by the anti-symmetry condition and

$$ d\theta' + \omega' \wedge \theta' = 0. $$

Then

$$ d\theta = d(C\theta') = dC \wedge \theta' + C\,d\theta' = dC\,C^{-1} \wedge \theta - C\omega'C^{-1} \wedge \theta $$

implying that

$$ \omega = -dC\,C^{-1} + C\omega'C^{-1} $$

or

$$ \omega' = C^{-1}\omega C + C^{-1}dC. \qquad (4.1) $$

We have

$$ \omega' \wedge \omega' = C^{-1}\omega \wedge \omega C + C^{-1}\omega \wedge dC + C^{-1}dC\,C^{-1} \wedge \omega C + C^{-1}dC \wedge C^{-1}dC $$

while

$$ d\omega' = d(C^{-1}) \wedge \omega C + C^{-1}d\omega\,C - C^{-1}\omega \wedge dC + d(C^{-1}) \wedge dC. $$

Now it follows from

$$ C^{-1}C = I $$

that

$$ d(C^{-1}) = -C^{-1}dC\,C^{-1} $$

and hence from the expression

$$ \Omega' = \omega' \wedge \omega' + d\omega' $$

we get

$$ \Omega' = C^{-1}\Omega C. \qquad (4.2) $$

Notice that this equation contains the assertion that the curvature is a tensor.
Indeed, recall that for any pair of tangent vectors $\xi, \eta \in TM_p$ the matrix $\Omega(\xi, \eta)$
gives the matrix of the operator $R_{\xi\eta} : TM_p \to TM_p$ relative to the orthonormal
basis $E_1(p), \dots, E_n(p)$. Let $\zeta \in TM_p$ be a tangent vector at p and let $z^i$ be the
coordinates of $\zeta$ relative to this basis so $\zeta = z^1E_1 + \cdots + z^nE_n$ which we can write
as

$$ \zeta = E(p)z \quad\text{where}\quad z = \begin{pmatrix} z^1 \\ \vdots \\ z^n \end{pmatrix}. $$

Then

$$ R_{\xi\eta}\zeta = E(p)\Omega(\xi, \eta)z. $$

If we use a different frame field E' = EC then $\zeta = E'(p)z'$ where $z' = C^{-1}(p)z$.
Equation (4.2) implies that

$$ \Omega'(\xi, \eta)z' = C^{-1}(p)\Omega(\xi, \eta)z $$

which shows that

$$ E'(p)\Omega'(\xi, \eta)z' = E(p)\Omega(\xi, \eta)z. $$

Thus the transformation $\zeta \mapsto E(p)\Omega(\xi, \eta)z$ is a well defined linear transforma-
tion. So if we did not yet know that $R_{\xi\eta}$ is a well defined linear transformation,
we could conclude this fact from (4.2).

4.3 The bundle of frames.

We will now make a reinterpretation of the arguments of the preceding section
which will have far reaching consequences. Let O(M) denote the set of all
"orthonormal" bases of all $TM_p$. So a point, $\mathcal{E}$, of O(M) is an "orthonormal"
basis of $TM_p$ for some point $p \in M$, and we will denote this point by $\pi(\mathcal{E})$. So

$$ \pi : O(M) \to M, \quad \mathcal{E} \text{ is an o.n. basis of } TM_{\pi(\mathcal{E})} $$

assigns to each $\mathcal{E}$ the point at which it is the orthonormal basis.

Suppose that E is a frame field defined on an open set $U \subset M$. If $p \in U$,
and $\pi(\mathcal{E}) = p$, then there is a unique "orthogonal" matrix A such that

$$ \mathcal{E} = E(p)A. $$

We will denote this matrix A by $\phi(\mathcal{E})$. (If we want to make the dependence on
the frame field explicit, we will write $\phi_E$ instead of $\phi$.) Thus

$$ \mathcal{E} = E(\pi(\mathcal{E}))\phi(\mathcal{E}). $$

This gives an identification

$$ \psi : \pi^{-1}(U) \to U \times G, \quad \mathcal{E} \mapsto (\pi(\mathcal{E}), \phi(\mathcal{E})) \qquad (4.3) $$

where G denotes the group of all "orthogonal" matrices. It follows from the
definition that

$$ \phi(\mathcal{E}B) = \phi(\mathcal{E})B, \quad \forall B \in G. \qquad (4.4) $$

Let E' be a second frame field defined on an open set U'. We have a map

$$ C : U \cap U' \to G $$

such that

$$ E' = EC $$

as in the last section. Thus

$$ \mathcal{E} = E\phi_E(\mathcal{E}) = EC(\pi(\mathcal{E}))\phi_{E'}(\mathcal{E}) $$

so

$$ \phi_E\,\phi_{E'}^{-1} = C \circ \pi. \qquad (4.5) $$

This shows that the identifications given by (4.3) define, in a consistent way, a
manifold structure on O(M). The manifold O(M) together with the action of
the "orthogonal group" G by "multiplication on the right"

$$ R_A : \mathcal{E} \mapsto \mathcal{E} \circ A^{-1} $$

and the differentiable map $\pi : O(M) \to M$ is called the bundle of (orthonor-
mal) frames.

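We will now define forms

$$ \vartheta = \begin{pmatrix} \vartheta^1 \\ \vdots \\ \vartheta^n \end{pmatrix} \quad\text{and}\quad \varpi = (\varpi^i_j) $$

on O(M):

4.3.1 The form $\vartheta$.

Let $\xi \in T(O(M))_{\mathcal{E}}$ be a tangent vector at the point $\mathcal{E} \in O(M)$. Then $d\pi_{\mathcal{E}}(\xi)$ is
a tangent vector to M at the point $\pi(\mathcal{E})$:

$$ d\pi_{\mathcal{E}}(\xi) \in TM_{\pi(\mathcal{E})}. $$

As such, the vector $d\pi_{\mathcal{E}}(\xi)$ has coordinates relative to the basis, $\mathcal{E}$ of $TM_{\pi(\mathcal{E})}$
and these coordinates depend linearly on $\xi$. So we may write

$$ d\pi_{\mathcal{E}}(\xi) = \vartheta^1(\xi)\mathcal{E}_1 + \cdots + \vartheta^n(\xi)\mathcal{E}_n $$

defining the forms $\vartheta^i$. As usual, we write this more succinctly as

$$ d\pi = \mathcal{E}\vartheta. $$

4.3.2 The form $\vartheta$ in terms of a frame field.

Let $v \in T(O(M))_{\mathcal{E}}$ be a tangent vector at the point $\mathcal{E} \in O(M)$. Assume that
$\pi(\mathcal{E})$ lies in the domain of definition of a frame field E and that $\mathcal{E} = E(p)A$
where $p = \pi(\mathcal{E})$. Let us write $d\pi(v)$ instead of $d\pi_{\mathcal{E}}(v)$ so as not to overburden
the notation. We have

$$ d\pi(v) = E(p)\theta(d\pi(v)) = \mathcal{E}\vartheta(v) = E(p)A\vartheta(v) $$

so

$$ A\vartheta(v) = \theta(d\pi(v)). $$

We can write this as

$$ \vartheta = \psi^*[A^{-1}\theta] \qquad (4.6) $$

where

$$ A^{-1}\theta $$

is the one form defined on U × G by

$$ A^{-1}\theta(\eta + \zeta) = A^{-1}(\theta(\eta)), \quad \eta \in TM_x, \quad \zeta \in TG_A. $$

Here we have made the standard identification of $T(U \times G)_{(x,A)}$ as a direct sum,

$$ T(U \times G)_{(x,A)} \sim TM_x \oplus TG_A, $$

valid on any product space.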



4.3.3 The definition of $\varpi$.

Next we will define $\varpi$ in terms of the identification $\psi$ given by a local frame
field, and check that it satisfies

$$d\vartheta + \varpi \wedge \vartheta = 0, \qquad \epsilon_i \varpi^i_j = -\epsilon_j \varpi^j_i.$$

By Cartan's lemma, this uniquely determines $\varpi$, so the definition must be
independent of the choice of frame field, and so $\varpi$ is globally defined on $O(M)$.

Let $\omega$ be the connection form (of the Levi-Civita connection) of the frame
field $E$.

Define

$$\varpi := \psi^*[A^{-1}\omega A + A^{-1}dA]$$   (4.7)

where the expression in brackets on the right is a matrix valued one form defined
on $U \times G$. Then on $U \times G$ we have

$$\begin{aligned} d[A^{-1}\theta] &= -A^{-1}dA \wedge A^{-1}\theta + A^{-1}d\theta \\ &= -A^{-1}dA \wedge A^{-1}\theta - A^{-1}\omega A \wedge A^{-1}\theta \end{aligned}$$

so

$$0 = d[A^{-1}\theta] + [A^{-1}\omega A + A^{-1}dA] \wedge A^{-1}\theta.$$

Applying $\psi^*$ yields

$$d\vartheta + \varpi \wedge \vartheta = 0$$

as desired. The antisymmetry condition says that $\omega$ takes values in the Lie
algebra of $G$. Hence so does $A\omega A^{-1}$ for any $A \in G$. We also know that $A^{-1}dA$
takes values in the Lie algebra of $G$. Hence so does $\varpi$.

4.4 The connection form in a frame field as a pull-back.

We now have a reinterpretation of the connection form, $\omega$, associated to a frame
field. Indeed, the form $\varpi$ is a matrix valued linear differential form defined on
all of $O(M)$. A frame field, $E$, defined on an open set $U$, can be thought of as
a map, $x \mapsto E(x)$, from $U$ to $O(M)$:

$$E : U \to O(M), \qquad x \mapsto E(x).$$

Then the pull-back of $\varpi$ under this map is exactly $\omega$, the connection form
associated to the frame field! In symbols

$$E^*\varpi = \omega.$$

To see this, observe that under the map $\psi : \pi^{-1}(U) \to U \times G$, we have
$\psi(E(x)) = (x, I)$ where $I$ is the identity matrix. Thus

$$\psi \circ E = (\mathrm{id}, I)$$

where $\mathrm{id} : U \to U$ is the identity map and $I$ means the constant map sending
every point $x$ into the identity matrix. By the chain rule

$$\begin{aligned} E^*\varpi &= E^*\psi^*[A^{-1}\omega A + A^{-1}dA] \\ &= (\psi \circ E)^*[A^{-1}\omega A + A^{-1}dA] \\ &= \omega. \end{aligned}$$

Thus, for example, the frame field $E$ is parallel relative to a vector field, $X$,
on $M$ if and only if $\nabla_X(E) = 0$, which is the same as

$$i(X)\omega = 0$$

where $\omega$ is the connection form of the frame field. In view of the preceding
result this is the same as

$$i[dE(X)]\varpi = 0.$$

Here $dE(X)$ denotes the vector field along the map $E : U \to O(M)$ which
assigns to each $x \in U$ the vector $dE_x(X(x))$.

Let me repeat this important point in a slightly different version. Suppose
that $C : [0, 1] \to M$ is a curve on $M$, and we start with an initial frame $\mathcal{E}(0)$ at
$C(0)$. We know that there is a unique curve $t \mapsto \mathcal{E}(t)$ in $O(M)$ which gives the
parallel transport of $\mathcal{E}(0)$ along the curve $C$. We have "lifted" the curve $C$ on
$M$ to the curve $\gamma : t \mapsto \mathcal{E}(t)$ on $O(M)$. The curve $\gamma$ is completely determined
by

• its initial value $\gamma(0)$,

• the fact that it is a lift of $C$, i.e. that $\pi(\gamma(t)) = C(t)$ for all $t$, and

• the condition $i(\gamma'(t))\varpi = 0$.   (4.8)

We now want to describe two important properties of the form $\varpi$. For $B \in G$,
recall that $r_B$ denotes the transformation

$$r_B : O(M) \to O(M), \qquad \mathcal{E} \mapsto \mathcal{E}B^{-1}.$$

We will use the same letter, $r_B$, to denote the transformation

$$r_B : U \times G \to U \times G, \qquad (x, A) \mapsto (x, AB^{-1}).$$

Because of (4.4), we may use this ambiguous notation since

$$\psi \circ r_B = r_B \circ \psi.$$

It then follows from the local definition (4.7) that

$$r_B^*\varpi = B\varpi B^{-1}.$$   (4.9)

Indeed

$$r_B^*(\varpi) = r_B^*\psi^*[A^{-1}\omega A + A^{-1}dA] = \psi^* r_B^*[A^{-1}\omega A + A^{-1}dA]$$

and

$$r_B^*(A^{-1}\omega A) = B(A^{-1}\omega A)B^{-1}$$

since $\omega$ does not depend on $G$, and

$$r_B^*(A^{-1}dA) = B(A^{-1}dA)B^{-1}.$$

We can write (4.9) as

$$r_B^*\varpi = \mathrm{Ad}_B(\varpi).$$   (4.10)

For the second property, we introduce some notation. Let $\xi$ be a matrix
which is "antisymmetric" in the sense that

$$\epsilon_i \xi^i_j = -\epsilon_j \xi^j_i.$$

This implies that the one parameter group

$$t \mapsto \exp(-t\xi) = I - t\xi + \tfrac{1}{2}t^2\xi^2 - \cdots$$

lies in our group $G$ for all $t$. Then the one parameter group of transformations

$$r_{\exp(-t\xi)} : O(M) \to O(M)$$

has as its infinitesimal generator a vector field, which we shall denote by $X_\xi$.
The one parameter group of transformations

$$r_{\exp(-t\xi)} : U \times G \to U \times G$$

also has an infinitesimal generator: Identifying the tangent space to the space
of matrices with the space of matrices, we see that the vector field generating
this one parameter group of transformations of $U \times G$ is

$$Y_\xi : (x, A) \mapsto A\xi.$$

So the vector field $Y_\xi$ takes values at each point in the $TG$ component of the
tangent space to $U \times G$ and assigns to each point $(x, A)$ the matrix $A\xi$. In
particular $\omega(Y_\xi) = 0$ since $\omega$ is only sensitive to the $TU$ component. Also $dA$
is by definition the tautological matrix valued differential form which assigns to
any tangent vector $Z$ the matrix $Z$. Hence

$$A^{-1}dA(Y_\xi) = \xi.$$

From

$$r_B \circ \psi = \psi \circ r_B$$

it follows that

$$d\psi(X_\xi) = Y_\xi$$

and hence that

$$\varpi(X_\xi) = \xi.$$   (4.11)

Finally, the curvature form from the point of view of the bundle of frames is
given as usual as

$$\Omega := d\varpi + \tfrac{1}{2}[\varpi \wedge, \varpi].$$   (4.12)

4.5 Gauss' theorems.

We pause with this section to go back to classical differential geometry using
the language we have developed so far.

4.5.1 Equations of structure of Euclidean space.

Suppose we take $M = \mathbf{R}^n$ with its standard Euclidean scalar product. The
Levi-Civita connection is then derived from the identification of the tangent
space at every point with $\mathbf{R}^n$ itself - a vector field becomes identified with an
$\mathbf{R}^n$ valued function which we can then differentiate. A point of $O(\mathbf{R}^n)$ can then
be written as $(x, \mathcal{E}_1, \ldots, \mathcal{E}_n)$ where $x \in \mathbf{R}^n$ and $\mathcal{E}_i \in \mathbf{R}^n$ with
$\mathcal{E}_1, \ldots, \mathcal{E}_n$ forming an orthonormal basis of $\mathbf{R}^n$. We then have

$$\pi(x, \mathcal{E}_1, \ldots, \mathcal{E}_n) = x$$

and

$$\vartheta^i = (dx, \mathcal{E}_i),$$

the right hand side being the scalar product of the vector valued differential
form $dx$ and the vector valued function $\mathcal{E}_i$. We have

$$dx = \mathcal{E}\vartheta.$$

Differentiating this equation gives

$$0 = d\mathcal{E} \wedge \vartheta + \mathcal{E}\,d\vartheta.$$

We have

$$d\mathcal{E}_j = \sum_i \varpi^i_j\,\mathcal{E}_i \qquad \text{where} \qquad \varpi^i_j := (d\mathcal{E}_j, \mathcal{E}_i)$$

or

$$d\mathcal{E} = \mathcal{E}\varpi.$$

We thus get

$$d\vartheta + \varpi \wedge \vartheta = 0$$

showing that $\varpi$ is indeed the connection form. Taking the exterior derivative of
the equation $d\mathcal{E} = \mathcal{E}\varpi$ gives

$$d\varpi + \varpi \wedge \varpi = 0$$

showing that the curvature does indeed vanish. To summarize, the equations of
structure of Euclidean geometry are

$$\vartheta^i = (dx, \mathcal{E}_i)$$   (4.13)
$$\varpi^i_j := (d\mathcal{E}_j, \mathcal{E}_i)$$   (4.14)
$$\varpi^i_j = -\varpi^j_i$$   (4.15)
$$dx = \mathcal{E}\vartheta$$   (4.16)
$$d\mathcal{E} = \mathcal{E}\varpi$$   (4.17)
$$d\vartheta + \varpi \wedge \vartheta = 0$$   (4.18)
$$d\varpi + \varpi \wedge \varpi = 0.$$   (4.19)

4.5.2 Equations of structure of a surface in $\mathbf{R}^3$.

We specialize to $n = 3$. Let $S$ be a surface in $\mathbf{R}^3$. This picks out a three
dimensional submanifold of the six dimensional $O(\mathbf{R}^3)$, call it $\mathcal{F}(S)$, consisting
of all $(x, \mathcal{E}_1, \mathcal{E}_2, \mathcal{E}_3)$ where

$$x \in S$$

and

$$\mathcal{E}_1 \text{ and } \mathcal{E}_2 \text{ are tangent to } S.$$

Of course this implies that $\mathcal{E}_3$ is normal to $S$. We will use a subscript $S$ to denote
the pull back of all functions and forms from $O(\mathbf{R}^3)$ to $\mathcal{F}(S)$. For example,
the vector valued differential form $dx_S$ takes values in the tangent space $TS_x$
regarded as a subspace of $\mathbf{R}^3$. Hence $\vartheta^3_S = 0$. The set of all $(x_S, \mathcal{E}_{1S}, \mathcal{E}_{2S})$
constitutes $O(S)$, the bundle of frames of $S$ thought of as a two dimensional
Riemann manifold. Since $\mathcal{E}_{3S}$ is determined up to a $\pm$ sign by the point $x_S$, we
can think of $\mathcal{F}(S)$ as a two fold cover of $O(S)$. [From a local point of view we
can always make a choice of the sign, and also from the global point of view if
the surface is orientable.]

From the equations of structure of Euclidean space we obtain

$$dx_S = \vartheta^1_S\,\mathcal{E}_{1S} + \vartheta^2_S\,\mathcal{E}_{2S}$$   (4.20)
$$d\mathcal{E}_{3S} = \varpi^1_{3S}\,\mathcal{E}_{1S} + \varpi^2_{3S}\,\mathcal{E}_{2S}$$   (4.21)
$$d\vartheta^1_S + \varpi^1_{2S} \wedge \vartheta^2_S = 0$$   (4.22)
$$d\vartheta^2_S - \varpi^1_{2S} \wedge \vartheta^1_S = 0$$   (4.23)
$$d\varpi^1_{2S} + \varpi^1_{3S} \wedge \varpi^3_{2S} = 0$$   (4.24)

the last equation following from $\varpi^i_j = -\varpi^j_i$ and $\varpi^i_i = 0$.

Equations (4.22) and (4.23) show that $\varpi^1_{2S}$ and $\varpi^2_{1S} = -\varpi^1_{2S}$ are the
connection forms of $O(S)$ if we (locally) identify it with $\mathcal{F}(S)$. In particular, $\varpi^1_{2S}$ is
intrinsically defined - it gives the Levi-Civita connection of the induced Riemann
metric on $S$.

4.5.3 Theorema egregium.

Equation (4.21) shows that

$$\varpi^1_{3S} \wedge \varpi^2_{3S} = K\,\vartheta^1_S \wedge \vartheta^2_S$$   (4.25)

where $K$ is the Gaussian curvature. Gauss's theorema egregium now follows
immediately from (4.24).

4.5.4 Holonomy.

Let $S$ be any two dimensional Riemann manifold (not necessarily embedded in
three space). The connection matrix is a two by two antisymmetric matrix

$$\begin{pmatrix} 0 & \varpi^1_{2S} \\ \varpi^2_{1S} & 0 \end{pmatrix} = \begin{pmatrix} 0 & \varpi^1_{2S} \\ -\varpi^1_{2S} & 0 \end{pmatrix}.$$

Let $E = (E_1, E_2)$ be a frame field on $S$, and let

$$\omega^1_2 = E^*\varpi^1_{2S}$$

be the corresponding form on $S$. Let $\gamma$ be a curve on $S$ lying entirely in the
domain of definition of the frame field, and let $t \mapsto v(t) \in TS_{\gamma(t)}$ be a field of
unit vectors along the curve. We can write

$$v(t) = \cos\phi(t)\,E_1(\gamma(t)) + \sin\phi(t)\,E_2(\gamma(t))$$

where $\phi(t)$ is the angle that the unit vector $v(t)$ makes with the first basis vector,
$E_1(\gamma(t))$, of the frame at $\gamma(t)$. Then

$$\begin{aligned} v' &= -\phi'\sin\phi\,E_1 + \omega^2_1(\gamma')\cos\phi\,E_2 + \phi'\cos\phi\,E_2 + \omega^1_2(\gamma')\sin\phi\,E_1 \\ &= (\phi' - \omega^1_2(\gamma'))\,[-\sin\phi\,E_1 + \cos\phi\,E_2]. \end{aligned}$$

In particular, $v$ is parallel along $\gamma$ if and only if

$$\phi' = \omega^1_2(\gamma').$$   (4.26)

Hence

$$[\phi] := \int_\gamma \omega^1_2$$   (4.27)

gives the change in $\phi$ of a parallel vector field along $\gamma$. Of course the angle $\phi$ is
relative to a choice of frame field, and so has no intrinsic meaning. But suppose
that $\gamma$ is a closed curve, so $[\phi]$ measures the rotation involved in transporting a
tangent vector all the way around the curve back to the starting point. This
is well defined, independent of the frame field, and (4.27) is valid for any closed
curve on the surface. In particular, suppose that

$$\gamma = \partial D,$$

i.e. suppose that $\gamma$ is the boundary curve of some oriented two dimensional
region. We then may apply Stokes' theorem and (4.25) to conclude that

$$[\phi] = \int_D K\,dA.$$   (4.28)
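As a hedged numerical illustration of the Stokes argument behind (4.28), the sketch below works on the unit sphere with colatitude $\theta$ and longitude $\phi$, using the frame $E_1 = \partial_\theta$, $E_2 = (1/\sin\theta)\,\partial_\phi$. Under the sign conventions of (4.22) and (4.23) a short computation gives $\omega^1_2 = -\cos\theta\,d\phi$; this formula, the choice of region (a band between two latitude circles, which avoids the poles where the frame is undefined), and all function names are assumptions made for the illustration and do not come from the text.

```python
import math

def integrate(f, a, b, n=2000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def omega12_on_latitude(theta):
    """Integral of omega^1_2 = -cos(theta) dphi around the latitude circle at
    colatitude theta, traversed in the direction of increasing longitude."""
    return integrate(lambda phi: -math.cos(theta), 0.0, 2.0 * math.pi)

def curvature_integral(theta1, theta2):
    """Integral of K dA over the band theta1 <= theta <= theta2 on the unit
    sphere, where K = 1 and dA = sin(theta) dtheta dphi."""
    return 2.0 * math.pi * integrate(math.sin, theta1, theta2)

theta1, theta2 = 0.4, 1.3
# The boundary of the band is the outer circle (theta2) with its positive
# orientation together with the inner circle (theta1) traversed backwards.
boundary = omega12_on_latitude(theta2) - omega12_on_latitude(theta1)
interior = curvature_integral(theta1, theta2)
print(boundary, interior)
```

The two printed numbers should agree up to quadrature error, which is the content of Stokes' theorem applied to $d\omega^1_2 = K\,dA$ on a region lying inside the domain of the frame field.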

4.5.5 Gauss-Bonnet.

Suppose that $D$ is contained in the domain of a frame field, say a frame field
obtained by orthonormalizing the basic fields of a coordinate patch, to fix the
ideas. Let $\psi$ denote the angle that the vector field makes with $\gamma'$ rather than
with $E_1$. The tangent vector $\gamma'$ turns through an angle of $2\pi$ relative to the
frame field as we traverse the curve. (This requires some proof in general, but
is obvious if $D$ is convex in some coordinate chart, since then the angle that $\gamma'$
makes with the $x_1$-axis is steadily increasing. So we can restrict to this case
to avoid calling in additional arguments.) Thus

$$[\psi] = [\phi] - 2\pi.$$

If $\gamma$ is only piecewise differentiable, like the boundary of a polygon, then the
change in $\psi$ will come from two sources, the contribution of the smooth portions
and the exterior angles at the corners. So we can write

$$[\psi] = [\text{edge contributions}] - \sum \text{exterior angles}.$$

We get

$$2\pi = \int_D K\,dA + \sum \text{exterior angles} - [\text{edge contributions}].$$

Now suppose we subdivide the surface into such "polygonal" regions, and sum
the preceding equation over all regions. The edge contributions will cancel, since
each edge will contribute twice, traversed in opposite directions. Thus

$$2\pi f = \int_S K\,dA + \sum \text{exterior angles}$$

where $f$ is the number of regions, $D$, or "faces". Now we can write each exterior
angle as

$$\pi - \text{interior angle}.$$

The sum of all the interior angles at each corner from the regions impinging on
it add up to $2\pi$. Each edge contributes to two corners. So if we let $e$ denote
the number of edges and $v$ the number of "vertices" or corners we obtain the
Gauss-Bonnet formula

$$f - e + v = \frac{1}{2\pi}\int_S K\,dA.$$   (4.29)

The amazing property of this formula is that the left hand side does not depend
on the choice of metric, while the right hand side does not depend on the choice
of subdivision (and is not obviously an integer on the face of it). So we obtain
Euler's theorem that $f - e + v$ is independent of the choice of subdivision, and
also that the integral of the curvature is independent of the choice of metric,
and is an integer equal to the Euler number $f - e + v$.
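To make the two sides of (4.29) concrete, here is a small sketch, under stated assumptions, that counts faces, edges and vertices for two subdivisions of the sphere (the combinatorics of a tetrahedron and of a cube) and compares the result with $(1/2\pi)\int_S K\,dA$ for a round sphere of radius $R$, where $K = 1/R^2$ and the area is $4\pi R^2$. The face lists and the helper name `euler_number` are illustrative choices, not part of the text.

```python
import math

def euler_number(faces):
    """Return f - e + v for a closed surface given as a list of faces, each
    face being a tuple of vertex labels in cyclic order."""
    vertices = {v for face in faces for v in face}
    edges = set()
    for face in faces:
        for i in range(len(face)):
            a, b = face[i], face[(i + 1) % len(face)]
            edges.add(frozenset((a, b)))  # an edge shared by two faces is counted once
    return len(faces) - len(edges) + len(vertices)

# Two different subdivisions of a topological sphere.
tetrahedron = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
cube = [(0, 1, 2, 3), (4, 5, 6, 7), (0, 1, 5, 4),
        (1, 2, 6, 5), (2, 3, 7, 6), (3, 0, 4, 7)]

def sphere_curvature_side(radius):
    """(1 / 2 pi) * integral of K dA for a round sphere: K = 1/R^2, area = 4 pi R^2."""
    gaussian_curvature = 1.0 / radius ** 2
    area = 4.0 * math.pi * radius ** 2
    return gaussian_curvature * area / (2.0 * math.pi)

print(euler_number(tetrahedron), euler_number(cube), sphere_curvature_side(3.0))
```

Both subdivisions give the same count, and the curvature side does not depend on the radius, which mirrors the independence statements in the paragraph above.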



Chapter 5 

Connections on principal 
bundles. 

According to the current "standard model" of elementary particle physics, every 
fundamental force is associated with a kind of curvature. But the curvatures 
involved are not only the geometric curvatures of space-time, but curvatures 
associated with the notion of a connection on a geometrical object (a "princi- 
pal bundle") which is a generalization of the bundle of frames studied in the 
preceding chapter. We develop the necessary geometrical facts in this chapter. 

5.1 Submersions, fibrations, and connections. 

A smooth map $\pi : Y \to X$ is called a submersion if $d\pi_y : TY_y \to TX_{\pi(y)}$
is surjective for every $y \in Y$. Suppose that $X$ is $n$-dimensional and that $Y$ is
$n + k$ dimensional. The implicit function theorem implies the following for a
submersion:

If $\pi : Y \to X$ is a submersion, then about any $y \in Y$ there exist coordinates
$z^1, \ldots, z^n; y^1, \ldots, y^k$ (such that $y$ has coordinates $(0, \ldots, 0; 0, \ldots, 0)$) and
coordinates $x^1, \ldots, x^n$ about $\pi(y)$ such that in terms of these coordinates $\pi$ is given
by

$$\pi(z^1, \ldots, z^n; y^1, \ldots, y^k) = (z^1, \ldots, z^n).$$

In other words, locally in $Y$, a submersion looks like the standard projection
from $\mathbf{R}^{n+k}$ to $\mathbf{R}^n$ near the origin. For the rest of this section we will let
$\pi : Y \to X$ denote a submersion.

For each $y \in Y$ we define the vertical subspace $\mathrm{Vert}_y$ of the tangent space
$TY_y$ to consist of those $\eta \in TY_y$ such that

$$d\pi_y(\eta) = 0.$$

In terms of the local description, the vertical subspace at any point in the
coordinate neighborhood of $y$ given above is spanned by the values of the vector
fields

$$\frac{\partial}{\partial y^1}, \ldots, \frac{\partial}{\partial y^k}$$

at the point in question. This shows that the $\mathrm{Vert}_y$ fit together to form a smooth
sub-bundle, call it $\mathrm{Vert}$, of the tangent bundle $TY$.

A general connection on the given submersion is a choice of complementary
sub-bundle $\mathrm{Hor}$ to $\mathrm{Vert}$. This means that at each $y \in Y$ we are given a
subspace $\mathrm{Hor}_y \subset TY_y$ such that

$$\mathrm{Vert}_y \oplus \mathrm{Hor}_y = TY_y$$

and that the $\mathrm{Hor}_y$ fit together smoothly to form a sub-bundle of $TY$. It follows
from the definition that $\mathrm{Hor}_y$ has the same dimension as $TX_{\pi(y)}$ and, in fact,
that the restriction of $d\pi_y$ to $\mathrm{Hor}_y$ is an isomorphism of $\mathrm{Hor}_y$ with $TX_{\pi(y)}$. We
should emphasize that the vertical bundle $\mathrm{Vert}$ comes along with the notion of
the submersion $\pi$. A connection $\mathrm{Hor}$, on the other hand, is an additional piece
of geometrical data above and beyond the submersion itself.

Let us describe a connection in terms of the local coordinates given above.
The local coordinates $x^1, \ldots, x^n$ on $X$ give rise to the vector fields

$$\frac{\partial}{\partial x^1}, \ldots, \frac{\partial}{\partial x^n}$$

which form a basis of the tangent spaces to $X$ at every point in the coordinate
neighborhood on $X$. Since $d\pi$ restricted to $\mathrm{Hor}$ is a bijection at every point of
$Y$, we conclude that there are functions $a_{ri}$, $r = 1, \ldots, k$, $i = 1, \ldots, n$, on the
coordinate neighborhood on $Y$ such that

$$\frac{\partial}{\partial x^1} + \sum_{r=1}^{k} a_{r1}\frac{\partial}{\partial y^r}, \quad \ldots, \quad \frac{\partial}{\partial x^n} + \sum_{r=1}^{k} a_{rn}\frac{\partial}{\partial y^r}$$

span $\mathrm{Hor}$ at every point of the neighborhood.

Let $C : [0, 1] \to X$ be a smooth curve on $X$. We say that a smooth curve $\gamma$
on $Y$ is a horizontal lift of $C$ if

• $\pi \circ \gamma = C$ and

• $\gamma'(t) \in \mathrm{Hor}_{\gamma(t)}$ for all $t$.

For the first condition to hold, each point $C(t)$ must lie in the image of $\pi$.
(The condition of being a submersion does not imply, without some additional
hypotheses, that $\pi$ is surjective.) Let us examine the second condition in terms
of our local coordinate description. Suppose that $x = C(0)$, that $x = \pi(y)$, and
we look for a horizontal lift with $\gamma(0) = y$. We can write

$$C(t) = (x^1(t), \ldots, x^n(t))$$

in terms of the local coordinate system on $X$. So if $\gamma$ is any lift (horizontal or
not) of $C$, we have

$$\gamma(t) = (x^1(t), \ldots, x^n(t); y^1(t), \ldots, y^k(t))$$

in terms of the local coordinate system. For $\gamma$ to be horizontal, we must have

$$\gamma'(t) = \sum_{i=1}^{n} x^{i\prime}(t)\left(\frac{\partial}{\partial x^i} + \sum_{r=1}^{k} a_{ri}\frac{\partial}{\partial y^r}\right).$$

Thus the condition that $\gamma$ be a horizontal lift amounts to the system of ordinary
differential equations

$$\frac{dy^r}{dt} = \sum_{i=1}^{n} a_{ri}\big(x^1(t), \ldots, x^n(t); y^1(t), \ldots, y^k(t)\big)\,x^{i\prime}(t)$$

where the $x^i$ and $x^{i\prime}$ are given functions of $t$. This is a system of (possibly)
non-linear ordinary differential equations. The existence and uniqueness theorem
for ordinary differential equations says that for a given initial condition $\gamma(0)$
there is some $\epsilon > 0$ for which there exists a unique solution of this system
of differential equations for $0 \le t < \epsilon$. Standard examples in the theory of
differential equations show that the solutions can "blow up" in a finite amount
of time, so that in general one can not conclude the existence of the horizontal lift
$\gamma$ over the entire interval of definition of the curve $C$.
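The blow-up phenomenon can be seen in a minimal sketch, assuming the simplest possible setting: $Y = \mathbf{R}^2$, $X = \mathbf{R}$, $\pi(x, y) = x$ (so $n = k = 1$), with $\mathrm{Hor}$ spanned by $\partial/\partial x + a(x, y)\,\partial/\partial y$. For the curve $C(t) = t$ the lifting equation reads $dy/dt = a(t, y)$. With the hypothetical choice $a = y^2$ and $y(0) = 1$ the solution is $y = 1/(1 - t)$, which escapes to infinity as $t \to 1$, whereas the linear choice $a = y$ gives $y = e^t$ for all time. The integrator below is a hand-written classical Runge-Kutta step, and every name in it is illustrative rather than taken from the text.

```python
import math

def horizontal_lift(a, x, dx, y0, t_end, steps=20000):
    """Integrate dy/dt = a(x(t), y) * x'(t) from t = 0 to t = t_end with the
    classical fourth-order Runge-Kutta method and return y(t_end)."""
    h = t_end / steps
    t, y = 0.0, y0

    def rhs(t, y):
        return a(x(t), y) * dx(t)

    for _ in range(steps):
        k1 = rhs(t, y)
        k2 = rhs(t + h / 2, y + h * k1 / 2)
        k3 = rhs(t + h / 2, y + h * k2 / 2)
        k4 = rhs(t + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return y

def curve(t):        # the base curve C(t) = t
    return t

def curve_dot(t):    # its derivative
    return 1.0

# Non-linear connection coefficient a(x, y) = y^2: the lift is 1 / (1 - t).
y_half = horizontal_lift(lambda x, y: y * y, curve, curve_dot, 1.0, 0.5)
y_late = horizontal_lift(lambda x, y: y * y, curve, curve_dot, 1.0, 0.9)

# Linear connection coefficient a(x, y) = y: the lift e^t exists for all t.
y_linear = horizontal_lift(lambda x, y: y, curve, curve_dot, 1.0, 1.0)
print(y_half, y_late, y_linear)
```

In the non-linear case the lift grows without bound as the end point approaches $t = 1$, so no horizontal lift of $C$ starting at $y(0) = 1$ exists over all of $[0, 1]$.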

In the case of linear differential equations, we do have existence for all time, 
and therefore in the case of linear connections, or the connection that we studied 
on the bundle of orthogonal frames, there was global lifting. 

We will now impose some restrictive conditions. We will say that the map
$\pi : Y \to X$ is a locally trivial fibration if there exists a manifold $F$ such that
every $x \in X$ has a neighborhood $U$ such that there exists a diffeomorphism

$$\psi_U : \pi^{-1}(U) \to U \times F$$

such that

$$\pi_1 \circ \psi_U = \pi$$

where

$$\pi_1 : U \times F \to U$$

is projection onto the first factor. The implicit function theorem asserts that
a submersion $\pi : Y \to X$ looks like a projection onto a first factor locally in
$Y$. The more restrictive condition of being a fibration requires that $\pi$ look like
projection onto the first factor locally on $X$, with a second factor $F$ which is
fixed up to a diffeomorphism. If the map $\pi : Y \to X$ is a surjective submersion
and is proper (meaning that the inverse image of a compact set is compact)
then we shall prove below that $\pi$ is a fibration if $X$ is connected.

A second condition that we will impose is on the connection $\mathrm{Hor}$. We will
assume that every smooth curve $C$ has a global horizontal lift $\gamma$. We saw that
this is the case when local coordinates can be chosen so that the equations for
the lifting are linear; we shall see that it is also true when $\pi$ is proper. But let
us take this global lifting condition as a hypothesis for the moment.

Let $C : [a, b] \to X$ be a smooth curve. For any $y \in \pi^{-1}(C(a))$ we have a
unique lifting $\gamma : [a, b] \to Y$ with $\gamma(a) = y$, and this lifting depends smoothly
on $y$ by the smooth dependence of solutions of differential equations on initial
conditions. We thus have a smooth diffeomorphism associated with any smooth
curve $C : [a, b] \to X$ sending

$$\pi^{-1}(C(a)) \to \pi^{-1}(C(b)).$$

If $c \in [a, b]$ it follows from the definition (and the existence and uniqueness
theorem for differential equations) that the composite of the map

$$\pi^{-1}(C(a)) \to \pi^{-1}(C(c))$$

associated with the restriction of $C$ to $[a, c]$ with the map

$$\pi^{-1}(C(c)) \to \pi^{-1}(C(b))$$

associated with the restriction of the curve $C$ to $[c, b]$ is exactly the map

$$\pi^{-1}(C(a)) \to \pi^{-1}(C(b))$$

above. This then allows us to define a map $\pi^{-1}(C(a)) \to \pi^{-1}(C(b))$ associated
to any piecewise differentiable curve, and the diffeomorphism associated to the
concatenation of two curves which form a piecewise differentiable curve is the
composite diffeomorphism.

Suppose that $X$ has a smooth retraction to a point. This means that there
is a smooth map $\phi : [0, 1] \times X \to X$ satisfying the following conditions, where

$$\phi_t : X \to X$$

denotes the map

$$\phi_t(x) = \phi(t, x)$$

as usual. Here are the conditions:

• $\phi_0 = \mathrm{id}$.

• $\phi_1(x) = x_0$, a fixed point of $X$.

• $\phi_t(x_0) = x_0$ for all $t \in [0, 1]$.

Suppose also that the submersion $\pi : Y \to X$ is surjective and has a connection
with global lifting. We claim that this implies that the submersion is a
trivial fibration; that there is a manifold $F$ and a diffeomorphism

$$\Phi : Y \to X \times F \qquad \text{with} \qquad \pi_1 \circ \Phi = \pi$$

where $\pi_1$ is projection onto the first factor. Indeed, take

$$F = \pi^{-1}(x_0).$$

For each $x \in X$ define

$$\Phi_x : \pi^{-1}(x) \to F$$

to be given by the lifting of the curve

$$t \mapsto \phi_t(x).$$

Then define

$$\Phi(y) := (\pi(y), \Phi_{\pi(y)}(y)).$$

The fact that $\Phi$ is a diffeomorphism follows from the fact that we can construct
the inverse of $\Phi$ by doing the lifting in the opposite direction on each of the above
curves. Every point on a manifold has a neighborhood which is diffeomorphic
to a ball around the origin in Euclidean space. Such a ball is retractible to the
origin by shrinking along radial lines. This proves that any surjective submersion
which has a connection with a global lifting is locally trivial, i.e. is a fibration.

For any submersion we can always construct a connection. Simply put a
Riemann metric on $Y$ and let $\mathrm{Hor}$ be the orthogonal complement to $\mathrm{Vert}$ relative
to this metric.

So to prove that if $\pi : Y \to X$ is a surjective submersion which is proper
then it is a fibration, it is more than enough to prove that every connection has
the global lifting property in this case. So let $C : [0, 1] \to X$ be a smooth curve.
Extend $C$ so it is defined on some slightly larger interval, say $[-a, 1 + a]$, $a > 0$.
For any $y \in \pi^{-1}(C(t))$, $t \in [0, 1]$, we can find a neighborhood $U_y$ and an $\epsilon > 0$
such that the lifting of $C(s)$ exists for all $z \in U_y$ and $t - \epsilon < s < t + \epsilon$. This is what
the local existence theorem for differential equations gives. But $C([0, 1])$ is a
compact subset of $X$, and hence $\pi^{-1}(C([0, 1]))$ is compact since $\pi$ is proper. This
means that we can cover $\pi^{-1}(C([0, 1]))$ by finitely many such neighborhoods, and
hence choose a fixed $\epsilon > 0$ that will work for all $y \in \pi^{-1}(C([0, 1]))$. But this
clearly implies that we have global lifting, since we can do the lifting piecemeal
over intervals of length less than $\epsilon$ and patch the local liftings together.

5.2 Principal bundles and invariant connections. 
5.2.1 Principal bundles. 

Let $G$ be a Lie group with Lie algebra $\mathfrak{g}$. Let $P$ be a space on which $G$ acts. To
tie in with our earlier notation, and also for later convenience, we will denote
this action by

$$(p, a) \mapsto pa^{-1}, \qquad p \in P,\ a \in G$$

so $a \in G$ acts on $P$ by a diffeomorphism that we will denote by $r_a$:

$$r_a : P \to P, \qquad r_a(p) = pa^{-1}.$$

If $\xi \in \mathfrak{g}$, then $\exp(-t\xi)$ is a one parameter subgroup of $G$, and hence

$$r_{\exp(-t\xi)}$$

is a one parameter group of diffeomorphisms of $P$, and for each $p \in P$, the curve

$$r_{\exp(-t\xi)}\,p = p(\exp t\xi)$$

is a smooth curve starting at $p$ at $t = 0$. The tangent vector to this curve at
$t = 0$ is a tangent vector to $P$ at $p$. In this way we get a linear map

$$u_p : \mathfrak{g} \to TP_p, \qquad u_p(\xi) = \frac{d}{dt}\,p(\exp t\xi)\Big|_{t=0}.$$   (5.1)

For example, if we take $P = G$ with $G$ acting on itself by right multiplication,
and if we assume that $G$ is a subgroup of $Gl(n)$, so that we may identify $TP_p$
as a subspace of the space of all $n \times n$ matrices, then we have seen that

$$u_p(\xi) = p\xi,$$

where the meaning of $p\xi$ on the right hand side is the product of the matrix $p$
with the matrix $\xi$. For this case, if $r_a(p) = p$ for some $p \in P$, then $a = e$, the
identity element.

In general, we say that the group action of $G$ on $P$ is free if no point of $P$
is fixed by any element of $G$ other than the identity. So "free" means that if
$r_a(p) = p$ for some $p \in P$ then $a = e$. Clearly, if the action is free, then the map
$u_p$ is injective for all $p \in P$.

If we have an action of $G$ on $P$ and on $Q$, then we automatically get an
action of $G$ (diagonally) on $P \times Q$, and if the action on $P$ is free then so is the
action on $P \times Q$.

For example (to change the notation slightly), if X is a space on which G 
acts trivially, and if we let G act on itself by right multiplication, then we get 
a free action of G on X x G. This is what we encountered when we began to 
construct the manifold structure on the bundle of orthogonal frames out of a 
local frame field. We now generalize this construction: 

If we are given an action of $G$ on $P$ we have a projection $\pi : P \to P/G$ which
sends each $p \in P$ to its $G$-orbit. We make the following assumptions:

• The action of $G$ on $P$ is free.

• The space $P/G$ is a differentiable manifold $M$ and the projection $\pi : P \to M$
is a smooth fibration.

• The fibration $\pi$ is locally trivial consistent with the $G$ action in the sense
that every $m \in M$ has a neighborhood $U$ such that there exists a
diffeomorphism

$$\psi_U : \pi^{-1}(U) \to U \times G$$

such that

$$\pi_1 \circ \psi_U = \pi$$

where

$$\pi_1 : U \times G \to U$$

is projection onto the first factor, and if $\psi_U(p) = (m, b)$ then

$$\psi_U(r_a p) = (m, ba^{-1}).$$

When all this happens, we say that $\pi : P \to M$ is a principal fiber bundle
over $M$ with structure group $G$.

Suppose that $\pi : P \to M$ is a principal fiber bundle with structure group
$G$. Since $\pi$ is a submersion, we have the sub-bundle $\mathrm{Vert}$ of the tangent bundle
$TP$, and from its construction, the subspace $\mathrm{Vert}_p \subset TP_p$ is spanned by the
tangents to the curves $p(\exp t\xi)$, $\xi \in \mathfrak{g}$. In other words, $u_p$ is a surjective map
from $\mathfrak{g}$ to $\mathrm{Vert}_p$. Since the action of $G$ on $P$ is free, we know that $u_p$ is injective.
Putting these two facts together we conclude that

Proposition 6 If $\pi : P \to M$ is a principal fiber bundle with structure group
$G$ then $u_p$ is an isomorphism of $\mathfrak{g}$ with $\mathrm{Vert}_p$ for every $p \in P$.

Let us compare the isomorphism $u_p$ with the isomorphism $u_{r_b(p)} = u_{pb^{-1}}$.
The action of $b \in G$ on $P$ preserves the fibration and hence

$$d(r_b)_p : \mathrm{Vert}_p \to \mathrm{Vert}_{pb^{-1}}.$$

Let $v = u_p(\xi) \in \mathrm{Vert}_p$. This means that

$$v = \frac{d}{dt}(p\exp t\xi)\Big|_{t=0}.$$

By definition

$$d(r_b)_p v = \frac{d}{dt}\big(r_b(p\exp t\xi)\big)\Big|_{t=0} = \frac{d}{dt}\big((p\exp t\xi)b^{-1}\big)\Big|_{t=0}.$$

We have

$$\begin{aligned} p(\exp t\xi)b^{-1} &= pb^{-1}\big(b(\exp t\xi)b^{-1}\big) \\ &= pb^{-1}\exp(t\,\mathrm{Ad}_b\,\xi) \end{aligned}$$

where $\mathrm{Ad}$ is the conjugation, or adjoint, action of $G$ on its Lie algebra. We have
thus shown that

$$d(r_b)_p\,u_p(\xi) = u_{r_b(p)}(\mathrm{Ad}_b\,\xi).$$   (5.2)
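The key identity used in deriving (5.2), namely $b(\exp t\xi)b^{-1} = \exp(t\,\mathrm{Ad}_b\,\xi)$ with $\mathrm{Ad}_b\,\xi = b\xi b^{-1}$ for a matrix group, can be checked numerically. The sketch below assumes $G = SO(3)$ realized as $3 \times 3$ matrices, takes $b$ to be a rotation about the third axis (so that $b^{-1}$ is its transpose) and $\xi$ an antisymmetric matrix, and computes the exponential by a truncated power series. The helper names and the specific matrices are illustrative assumptions, not part of the text.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scale(c, m):
    """Multiply every entry of the matrix m by the scalar c."""
    return [[c * x for x in row] for row in m]

def transpose(m):
    return [list(row) for row in zip(*m)]

def expm(m, terms=30):
    """Matrix exponential via the truncated series I + m + m^2/2! + ..."""
    n = len(m)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, m))
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

angle, t = 0.7, 0.8
c, s = math.cos(angle), math.sin(angle)
b = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]   # rotation about the third axis
b_inv = transpose(b)                                 # inverse of an orthogonal matrix
xi = [[0.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]]  # antisymmetric generator

lhs = matmul(matmul(b, expm(scale(t, xi))), b_inv)   # b exp(t xi) b^{-1}
ad_b_xi = matmul(matmul(b, xi), b_inv)               # Ad_b xi = b xi b^{-1}
rhs = expm(scale(t, ad_b_xi))                        # exp(t Ad_b xi)
max_diff = max(abs(lhs[i][j] - rhs[i][j]) for i in range(3) for j in range(3))

# exp(t xi) should itself be orthogonal, since xi is antisymmetric.
g = expm(scale(t, xi))
gtg = matmul(transpose(g), g)
orth_defect = max(abs(gtg[i][j] - float(i == j)) for i in range(3) for j in range(3))
print(max_diff, orth_defect)
```

Both printed quantities should be at the level of floating point rounding, which is consistent with conjugation by $b$ carrying one parameter subgroups to one parameter subgroups.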
5.2.2 Connections on principal bundles. 

Let $\pi : P \to M$ be a principal bundle with structure group $G$. Recall that in
the general setting, we defined a (general) connection to be a sub-bundle $\mathrm{Hor}$
of the tangent bundle $TP$ which is complementary to the vertical sub-bundle
$\mathrm{Vert}$. Given the group action of $G$, we can demand that $\mathrm{Hor}$ be invariant under
$G$. So by a connection on a principal bundle we will mean a sub-bundle
$\mathrm{Hor}$ of the tangent bundle such that

$$TP_p = \mathrm{Vert}_p \oplus \mathrm{Hor}_p \qquad \text{at all } p \in P$$

and

$$d(r_b)_p(\mathrm{Hor}_p) = \mathrm{Hor}_{r_b(p)} \qquad \forall\, b \in G,\ p \in P.$$   (5.3)

At any $p$ we can define the projection

$$V_p : TP_p \to \mathrm{Vert}_p$$

along $\mathrm{Hor}_p$, i.e. $V_p$ is the identity on $\mathrm{Vert}_p$ and sends all elements of $\mathrm{Hor}_p$ to
0. Giving $\mathrm{Hor}_p$ is the same as giving $V_p$, and condition (5.3) is the same as the
condition

$$d(r_b)_p \circ V_p = V_{r_b(p)} \circ d(r_b)_p \qquad \forall\, b \in G,\ p \in P.$$   (5.4)

Let us compose $u_p^{-1} : \mathrm{Vert}_p \to \mathfrak{g}$ with $V_p$. So we define the $\mathfrak{g}$ valued form $\omega$ by

$$\omega_p := u_p^{-1} \circ V_p.$$   (5.5)

Then it follows from (5.2) and (5.4) that

$$r_b^*\omega = \mathrm{Ad}_b\,\omega.$$   (5.6)

Let $\xi_P$ be the vector field on $P$ which is the infinitesimal generator of $r_{\exp t\xi}$.
In view of the definition of $u_p$ as identifying $\xi$ with the tangent vector to the curve
$t \mapsto p(\exp t\xi) = r_{\exp(-t\xi)}\,p$ at $t = 0$, we see that

$$i(\xi_P)\omega = -\xi.$$   (5.7)

The infinitesimal version of (5.6) is

$$D_{\xi_P}\omega = [\xi, \omega].$$   (5.8)

Define the curvature by our formula

$$\Omega := d\omega + \tfrac{1}{2}[\omega \wedge, \omega].$$   (5.9)

It follows from (5.6) that

$$r_b^*\Omega = \mathrm{Ad}_b\,\Omega \qquad \forall\, b \in G.$$   (5.10)

Now

$$i(\xi_P)\,d\omega = D_{\xi_P}\omega - d\,i(\xi_P)\omega$$

by Weil's formula for the Lie derivative. By (5.7) the second term on the right
vanishes because it is the differential of the constant $-\xi$. So

$$i(\xi_P)\,d\omega = [\xi, \omega].$$

On the other hand

$$i(\xi_P)[\omega \wedge, \omega] = [i(\xi_P)\omega, \omega] - [\omega, i(\xi_P)\omega] = -2[\xi, \omega]$$

where we used (5.7) again. So

$$i(v)\Omega = 0 \qquad \text{if } v \in \mathrm{Vert}_p.$$   (5.11)

To understand the meaning of $\Omega$ when evaluated on a pair of horizontal
vectors, let $X$ and $Y$ be a pair of horizontal vector fields, that is, vector fields whose
values at every point are elements of $\mathrm{Hor}$. Then $i(X)\omega = 0$ and $i(Y)\omega = 0$. So

$$\Omega(X, Y) = i(Y)i(X)\Omega = i(Y)i(X)\,d\omega = d\omega(X, Y).$$

But by our general formula for the exterior derivative we have

$$d\omega(X, Y) = X(i(Y)\omega) - Y(i(X)\omega) - \omega([X, Y]).$$

The first two terms vanish and so

$$\Omega(X, Y) = -\omega([X, Y]).$$   (5.12)

This shows how the curvature measures the failure of the bracket of two
horizontal vector fields to be horizontal.

5.2.3 Associated bundles. 

Let $\pi : P \to M$ be a principal bundle with structure group $G$, and let $F$ be
some manifold on which $G$ acts. We will write this action as multiplication on
the left; i.e. we will denote the action of an element $a \in G$ on an element $f \in F$
as $af$. We then have the diagonal action of $G$ on $P \times F$: For $a \in G$ we define

$$\mathrm{diag}(a) : P \times F \to P \times F, \qquad \mathrm{diag}(a)(p, f) = (pa^{-1}, af).$$

Since the action of $G$ on $P$ is free, so is its diagonal action on $P \times F$. We
can form the quotient space of this action, i.e. identify all elements of $P \times F$
which lie on the same orbit; so we identify the points $(p, f)$ and $(pa^{-1}, af)$. The
quotient space under this identification will be denoted by

$$P \times_G F$$

or by

$$F(P).$$

It is a manifold and the projection map $\pi : P \to M$ descends to a projection of
$F(P) \to M$ which we will denote by $\pi_F$ or simply by $\pi$ when there is no danger
of confusion. The map $\pi_F : F(P) \to M$ is a fibration. The bundle $F(P)$ is
called the bundle associated to $P$ by the $G$-action on $F$.
Let

$$\rho : P \times F \to F(P)$$

be the map which sends $(p, f)$ into its equivalence class.

Suppose that we are given a connection on the principal bundle $P$. Recall
that this means that at each $p \in P$ we are given a subspace $\mathrm{Hor}_p \subset TP_p$ which
is complementary to the vertical, and that this assignment is invariant under
the action of $G$ in the sense that

$$\mathrm{Hor}_{pa^{-1}} = dr_a(\mathrm{Hor}_p).$$

Given an $f \in F$, we can consider $\mathrm{Hor}_p$ as the subspace

$$\mathrm{Hor}_p \times \{0\} \subset T(P \times F)_{(p,f)} = TP_p \oplus TF_f$$

and then form

$$d\rho_{(p,f)}(\mathrm{Hor}_p) \subset T(F(P))_{\rho(p,f)}$$

which is complementary to the vertical subspace $V(F(P))_{\rho(p,f)} \subset T(F(P))_{\rho(p,f)}$.
The invariance condition of $\mathrm{Hor}$ implies that $d\rho_{(p,f)}(\mathrm{Hor}_p)$ is independent of the
choice of $(p, f)$ in its equivalence class.

So a connection on a principal bundle induces a connection on each of its
associated bundles.

5.2.4 Sections of associated bundles. 

If $\pi : Y \to X$ is a submersion, then a section of this submersion is a map

$$s : X \to Y$$

such that

$$\pi \circ s = \mathrm{id}.$$

In other words, $s$ is a map which associates to each $x \in X$ an element

$$s(x) \in Y_x = \pi^{-1}(x).$$

Naturally, we will be primarily interested in sections which are smooth.

For example, we might consider the tangent bundle $TM$. A section of the
tangent bundle then associates to each $x \in M$ a tangent vector $s(x) \in TM_x$. In
other words, $s$ is a vector field. Similarly, a linear differential form on $M$ is a
section of the cotangent bundle $T^*M$.

Suppose that $\pi = \pi_F : F(P) \to M$ is an associated bundle of a principal
bundle $P$, and that $s : M \to F(P)$ is a section of this bundle. Let $x$ be a point
of $M$, and let $p \in P_x = \pi^{-1}(x)$ be a point of the principal bundle
$P \to M$ lying in the fiber over $x$. Then there is a unique $f \in F$ such that

$$\rho((p, f)) = s(x).$$

We thus get a function $\phi_s : P \to F$ by assigning to $p$ this element $f \in F$. In
other words, $\phi_s$ is uniquely determined by

$$\rho((p, \phi_s(p))) = s(\pi(p)).$$   (5.13)

Suppose we replace $p$ by $r_a(p) = pa^{-1}$. Since $\rho((pa^{-1}, af)) = \rho((p, f))$ we see
that $\phi_s$ satisfies the condition

$$\phi \circ r_a = a\phi \qquad \forall\, a \in G.$$   (5.14)

Conversely, suppose that $\phi : P \to F$ satisfies (5.14). Then

$$\rho((p, \phi(p))) = \rho((pa^{-1}, \phi(pa^{-1})))$$

and so defines an element $s(x) \in F(P)_x$, $x = \pi(p)$. So a $\phi : P \to F$ satisfying (5.14)
determines a section $s : M \to F(P)$ with $\phi = \phi_s$. It is routine to check that $s$
is smooth if and only if $\phi$ is smooth. We have thus proved

Proposition 7 There is a one to one correspondence between (smooth) sections
$s : M \to F(P)$ and (smooth) functions $\phi : P \to F$ satisfying (5.14). The
correspondence is given by (5.13).

An extremely special case of this proposition is where we take $F$ to be the
real numbers with the trivial action of $G$ on $\mathbf{R}$. Then $\mathbf{R}(P) = M \times \mathbf{R}$ since
the map $\rho$ does not identify two distinct elements of $\mathbf{R}$ but merely identifies all
elements of $P_x$. A section $s$ of $M \times \mathbf{R}$ is of the form $s(x) = (x, f(x))$ where $f$
is a real valued function. The proposition then asserts that we can identify real
valued functions on $M$ with real valued functions on $P$ which are constant on
the fibers $P_x$.
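
The correspondence of Proposition 7 can be tried out on a finite toy model. The
sketch below is only an illustration under stated assumptions: the base M = {0, 1},
the group G = Z/3 written additively, the trivial bundle P = M x G and the fiber
F = Z/3 are all hypothetical choices made here, not objects from the text.

```python
# Toy model (all choices hypothetical): M = {0, 1}, G = Z/3 written additively,
# P = M x G with r_a(p) = p a^{-1}, and F = Z/3 with G acting by translation.
N = 3
M = [0, 1]
G = range(N)

def r(a, p):
    """Right action r_a(p) = p a^{-1}; additively, subtract a in the fiber."""
    x, g = p
    return (x, (g - a) % N)

def act(a, f):
    """Left action of G on F; additively, f -> f + a."""
    return (f + a) % N

def rho(p, f):
    """Quotient map P x F -> F(P) = M x F; constant on orbits (p a^{-1}, a f)."""
    x, g = p
    return (x, (g + f) % N)

def phi_from_section(s):
    """phi_s(p) is the unique f with rho(p, f) = s(pi(p)), as in (5.13)."""
    def phi(p):
        x, _ = p
        return next(f for f in range(N) if rho(p, f) == s(x))
    return phi

def is_equivariant(phi):
    """Condition (5.14): phi(r_a(p)) = a . phi(p) for all a and p."""
    return all(phi(r(a, (x, g))) == act(a, phi((x, g)))
               for a in G for x in M for g in G)

s = lambda x: (x, (2 * x + 1) % N)   # an arbitrary section of M x F -> M
phi_s = phi_from_section(s)
print(is_equivariant(phi_s))         # the function built from s satisfies (5.14)
print(is_equivariant(lambda p: 0))   # a constant function does not
```

In this model a constant function fails (5.14) because G acts nontrivially on F,
in contrast with the special case of the trivial action discussed above.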

5.2.5 Associated vector bundles. 

We now specialize to the case that $F$ is a vector space, and the action of $G$ on $F$
is linear. In other words, we are given a linear representation of $G$ on the vector
space $F$. If $x \in M$ we can add two elements $v_1$ and $v_2$ of $F(P)_x$ by choosing
$p \in P_x$ which then determines $f_1$ and $f_2$ in $F$ such that

$$\rho((p, f_1)) = v_1 \quad \text{and} \quad \rho((p, f_2)) = v_2.$$

We then define

$$v_1 + v_2 := \rho((p, f_1 + f_2)).$$

The fact that the action of $G$ on $F$ is linear guarantees that this definition is
independent of the choice of $p$. In a similar way, we define multiplication of an
element of $F(P)_x$ by a scalar and verify that all the conditions for $F(P)_x$ to be
a vector space are satisfied.

Let $V \to M$ be a vector bundle. So $V \to M$ is a fibration for which each $V_x$
has the structure of a vector space. (As a class of examples of vector bundles
we can consider the associated vector bundles $F(P)$ just considered.) We can
then consider $V$ valued differential forms on $M$. For example, a $V$ valued linear
differential form $\tau$ will be a rule which assigns a linear map

$$\tau_x : TM_x \to V_x$$

for each $x \in M$, and similarly we can talk of $V$ valued $k$-forms.

For the case that $V = F(P)$ is an associated vector bundle we have a
generalization of Proposition 7 to the case of differential forms. That is, we can
describe $F(P)$ valued differential forms as certain kinds of $F$-valued forms on
$P$. To see how this works, suppose that $\tau$ is an $F(P)$-valued $k$-form on $M$. Let
$x \in M$ and let $p \in P_x$. Now

$$\tau_x : \wedge^k(TM_x) \to F(P)_x$$

and $p$ gives an identification map which we will denote by

$$\mathrm{ident}_p$$

of $F(P)_x$ with $F$, the element $f \in F$ being identified with $\rho((p, f)) \in F(P)_x$.
Also,

$$d\pi_p : TP_p \to TM_x$$

and so induces a map (which we shall also denote by $d\pi_p$)

$$d\pi_p : \wedge^k(TP_p) \to \wedge^k(TM_x).$$

So

$$\sigma_p := \mathrm{ident}_p \circ \tau_x \circ d\pi_p$$

maps $\wedge^k(TP_p) \to F$. Thus we have defined an $F$-valued $k$-form $\sigma$ on $P$. If $v$ is
a vertical tangent vector at any point $p$ of $P$ we have $d\pi_p(v) = 0$, so

$$i(v)\sigma = 0 \quad \text{if } v \in \mathrm{Vert}(P).$$ (5.15)

Let us see what happens when we replace $p$ by $r_a(p) = pa^{-1}$ in the expression
for $\sigma$. Since $\pi \circ r_a = \pi$, we conclude that

$$d\pi_{pa^{-1}} \circ d(r_a)_p = d\pi_p.$$

Also,

$$\mathrm{ident}_{pa^{-1}} = a \circ \mathrm{ident}_p$$

where the $a$ on the right denotes the action of $a$ on $F$. We thus conclude that

$$r_a^*\sigma = a \circ \sigma.$$ (5.16)

Conversely, suppose that $\sigma$ is an $F$-valued $k$-form on $P$ which satisfies (5.15)
and (5.16). It defines an $F(P)$ valued $k$-form $\tau$ on $M$ as follows: At each $x \in M$
choose a $p \in P_x$. For any $k$ tangent vectors $v_1, \dots, v_k \in TM_x$ choose tangent
vectors $w_1, \dots, w_k \in TP_p$ such that

$$d\pi_p(w_j) = v_j, \quad j = 1, \dots, k.$$

Then consider

$$\sigma_p(w_1 \wedge \cdots \wedge w_k) \in F.$$






Condition (5.15) guarantees that this value is independent of the choice of the
$w_j$ with $d\pi_p(w_j) = v_j$. In this way we define a map

$$\wedge^k(TM_x) \to F.$$

If we now apply $\rho(p, \cdot)$ to the image, we get a map

$$\wedge^k(TM_x) \to F(P)_x$$

and condition (5.16) guarantees that this map is independent of the choice of
$p \in P_x$. From the construction it is clear that the assignments $\tau \to \sigma$ and $\sigma \to \tau$
are inverses of one another. We have thus proved:

Proposition 8 There is a one to one correspondence between $F(P)$ valued forms
on $M$ and $F$ valued forms on $P$ which satisfy (5.15) and (5.16).

Forms on $P$ which satisfy (5.15) and (5.16) are called basic forms because
(according to the proposition) $F$-valued forms on $P$ which satisfy (5.15) and
(5.16) correspond to forms on the base manifold $M$ with values in the associated
bundle $F(P)$.

For example, equations (5.10) and (5.11) say that the curvature of a connection
on a principal bundle is a basic $\mathfrak{g}$ valued form relative to the adjoint action
of $G$ on $\mathfrak{g}$. According to the proposition, we can consider this curvature as a
two form on the base $M$ with values in $\mathfrak{g}(P)$, the vector bundle associated to $P$
by the adjoint action of $G$ on its Lie algebra.

Here is another important illustration of the concept. Equation (5.6) says
that a connection form $\omega$ satisfies (5.16), but it certainly does not satisfy (5.15).
Indeed, the interior product of a vertical vector with the linear differential form
$\omega$ is given by (5.7). However, suppose that we are given two connection forms
$\omega_1$ and $\omega_2$. Then their difference $\omega_1 - \omega_2$ does satisfy (5.15) and, of course,
(5.16). We can phrase this by saying that the difference of two connections is a
basic $\mathfrak{g}$ valued one-form.

5.2.6 Exterior products of vector valued forms. 

Suppose that $F_1$ and $F_2$ are two vector spaces on which $G$ acts, and suppose
that we are given a bilinear map

$$b : F_1 \times F_2 \to F_3$$

into a third vector space $F_3$ on which $G$ acts, and suppose that $b$ is consistent
with the actions of $G$ in the sense that

$$b(af_1, af_2) = a\,b(f_1, f_2).$$

Examples of such a situation that we have come across before are:

1. $G$ is a subgroup of $Gl(n)$ and $F_1$, $F_2$ and $F_3$ are all the vector space of
$n \times n$ matrices, and $b$ is matrix multiplication.

2. $G$ is a subgroup of $Gl(n)$, $F_1$ is the space of all $n \times n$ matrices, $F_2$ and $F_3$
are $\mathbf{R}^n$ and $b$ is multiplication of a matrix times a vector.

3. $G$ is a general Lie group, $F_1 = F_2 = F_3 = \mathfrak{g}$, the Lie algebra of $G$, and $b$
is Lie bracket.

In each of these cases we have had occasion to form the exterior product of an
$F_1$ valued differential form with an $F_2$ valued differential form to obtain an $F_3$
valued form.

We can do this construction in general: form the exterior product of an $F_1$
valued $k$-form with an $F_2$-valued $\ell$-form to get an $F_3$ valued $(k + \ell)$-form. For
example, if $\{f^1_i\}$ is a basis of $F_1$ and $\{f^2_j\}$ is a basis of $F_2$ then the
most general $F_1$-valued $k$-form $\alpha$ can be written as

$$\alpha = \sum_i \alpha^i f^1_i$$

where the $\alpha^i$ are real valued $k$-forms, and the most general $F_2$-valued $\ell$-form $\beta$
can be written as

$$\beta = \sum_j \beta^j f^2_j$$

where the $\beta^j$ are real valued $\ell$-forms. Let $\{f^3_k\}$ be a basis of $F_3$ and define
the numbers $B^k_{ij}$ by

$$b(f^1_i, f^2_j) = \sum_k B^k_{ij} f^3_k.$$

Then you can check that $\alpha \wedge \beta$ defined by

$$\alpha \wedge \beta := \sum_{i,j,k} B^k_{ij}\, \alpha^i \wedge \beta^j\, f^3_k$$

is independent of the choice of bases. In a similar way we can define the exterior
derivative of a vector valued form, the interior product of a vector valued form
with a vector field, the pull back of a vector valued form under a map, etc.
There should be little problem in understanding the concept involved. There
is a bit of a notational problem: how explicit do we want to make the map $b$
in writing down a symbol for this exterior product? In example 1) we simply
wrote $\wedge$ for the exterior product of two matrix valued forms. This forced us to
use the rather ugly $[\alpha\wedge, \beta]$ for the exterior product of two Lie algebra valued
forms, where the $b$ was commutator or Lie bracket. We shall retain this ugly
notation for the sake of the clarity it gives.
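
The role of the bilinear map b can be made concrete for one-forms evaluated at a
single point. The following is a minimal sketch, assuming 2 x 2 matrix valued
one-forms on a two dimensional tangent space and the usual convention
(alpha ^ beta)(v, w) = b(alpha(v), beta(w)) - b(alpha(w), beta(v)); the helper names
and the coefficients of `omega` are hypothetical. It compares example 1 (matrix
multiplication) with example 3 (Lie bracket) and shows the factor of two that
motivates the 1/2 in front of the bracket of a form with itself.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists (example 1)."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def sub(a, b):
    """Entrywise difference of two 2x2 matrices."""
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]

def bracket(a, b):
    """Commutator [a, b] = ab - ba (example 3)."""
    return sub(matmul(a, b), matmul(b, a))

def evaluate(form, v):
    """Apply the matrix valued one-form A_1 dx^1 + A_2 dx^2 to v = (v^1, v^2)."""
    return [[sum(v[i] * form[i][r][c] for i in range(2)) for c in range(2)]
            for r in range(2)]

def wedge(b, alpha, beta):
    """Exterior product of two one-forms built from the bilinear map b."""
    def two_form(v, w):
        return sub(b(evaluate(alpha, v), evaluate(beta, w)),
                   b(evaluate(alpha, w), evaluate(beta, v)))
    return two_form

# Hypothetical coefficients: omega = E12 dx^1 + E21 dx^2.
omega = [[[0, 1], [0, 0]], [[0, 0], [1, 0]]]
v, w = (1, 0), (0, 1)

matrix_wedge = wedge(matmul, omega, omega)(v, w)    # (omega ^ omega)(v, w)
bracket_wedge = wedge(bracket, omega, omega)(v, w)  # [omega^, omega](v, w)
print(matrix_wedge, bracket_wedge)
```

For a one-form the bracket version evaluates to twice the commutator of the two
values, which is also twice the matrix wedge product.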

A situation that we will want to discuss in the next section is: we are given
an action of $G$ on a vector space $F$, and unless forced to be more explicit, we
have chosen to denote the action of an element $a \in G$ on an element $f \in F$
simply by $af$. This determines a bilinear map

$$b : \mathfrak{g} \times F \to F$$

by

$$b(\xi, f) := \frac{d}{dt}(\exp t\xi)f\Big|_{t=0}.$$

We therefore get an exterior multiplication of a $\mathfrak{g}$-valued form with an $F$-valued
form. We shall denote this particular type of exterior multiplication by $\bullet$. So if
$\alpha$ is a $\mathfrak{g}$-valued $k$-form and $\beta$ is an $F$-valued $\ell$-form then $\alpha \bullet \beta$ is an $F$-valued
$(k + \ell)$-form.

We point out that conditions (5.15) and (5.16) make perfectly good sense
for vector valued forms, and so we can talk of basic vector valued forms on $P$,
and the exterior product of two basic vector valued forms is again basic.

5.3 Covariant differentials and covariant derivatives.

In this section we consider a fixed connection on a principal bundle $P$. This
means that we are given a projection $V$ of $TP$ onto the vertical bundle and
therefore a connection form $\omega$. Of course we also have a projection

$$\mathrm{id} - V$$

onto the horizontal bundle Hor of the connection, where id is the identity
operator. This projection kills all vertical vectors.

5.3.1 The horizontal projection of forms. 

If $\alpha$ is a (possibly vector valued) $k$-form on $P$, we will define the horizontal
projection $H\alpha$ of $\alpha$ by

$$H\alpha(v_1, \dots, v_k) = \alpha((\mathrm{id} - V)v_1, \dots, (\mathrm{id} - V)v_k).$$ (5.17)

The following properties of $H$ follow immediately from its definition and the
invariance of the horizontal bundle under the action of $G$:

1. $H(\alpha \wedge \beta) = H\alpha \wedge H\beta$.

2. $r_a^* \circ H = H \circ r_a^*$ for all $a \in G$.

3. If $\alpha$ has the property that $i(w)\alpha = 0$ for any horizontal vector $w$ then
$H\alpha = 0$. In particular,

4. $H\omega = 0$.

5. If $\alpha$ has the property that $i(v)\alpha = 0$ for any vertical vector $v$ then $H\alpha = \alpha$.
In particular,

6. $H$ is the identity on basic forms.

In 1) $\alpha$ and $\beta$ could be vector valued forms if we have the bilinear map $b$ which
allows us to multiply them.

The map $H$ is clearly a projection in the sense that

$$H^2 = H.$$
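
Properties 4 and 6 and the relation H^2 = H can be checked in a linear algebra
toy model at a single point. The sketch below is only illustrative: it assumes a
three dimensional tangent space with vertical line spanned by d/dz and a real
valued stand-in for the connection form with hypothetical coefficients A and B,
so it ignores the Lie algebra values and sign conventions used in the text.

```python
A, B = 2, -1   # hypothetical coefficients of the toy connection form

def omega(v):
    """Toy connection form on R^3: omega = A dx + B dy + dz."""
    return A * v[0] + B * v[1] + v[2]

def vertical_part(v):
    """The projection V onto the vertical line spanned by d/dz."""
    return (0, 0, omega(v))

def horizontal_part(v):
    """The complementary projection id - V onto ker(omega)."""
    return tuple(a - b for a, b in zip(v, vertical_part(v)))

def H(alpha):
    """Horizontal projection of a one-form, as in (5.17) with k = 1."""
    return lambda v: alpha(horizontal_part(v))

alpha = lambda v: 3 * v[0] + 5 * v[2]   # an arbitrary one-form
v = (1, 2, 3)

print(H(omega)(v))                    # H kills the connection form
print(H(alpha)(v), H(H(alpha))(v))    # applying H twice changes nothing
```

The horizontal subspace here is the kernel of the toy form, so the projection
id - V subtracts exactly the vertical component measured by it.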
5.3.2 The covariant differential of forms on P.

Define $\mathbf{d}$ mapping $k$-forms into $(k + 1)$-forms by

$$\mathbf{d} := H \circ d.$$ (5.18)

The following facts are immediate:

• $\mathbf{d}(\alpha \wedge \beta) = \mathbf{d}\alpha \wedge H\beta + (-1)^k H\alpha \wedge \mathbf{d}\beta$ if $\alpha$ is a $k$-form.

• $i(v)\mathbf{d} = 0$ for any vertical vector $v$.

• $r_a^* \circ \mathbf{d} = \mathbf{d} \circ r_a^*$ for all $a \in G$.

It follows from the second and third items that $\mathbf{d}$ carries basic forms into basic
forms.

If $F$ is a vector space on which $G$ acts linearly, we can form the associated
vector bundle $F(P)$, and we know from Proposition 8 that $k$-forms on $M$ with
values in $F(P)$ are the same as basic $k$-forms on $P$ with values in $F$. So giving
a connection on $P$ induces an operator $\mathbf{d}$ mapping $k$-forms on $M$ with values
in $F(P)$ to $(k + 1)$-forms on $M$ with values in $F(P)$. For example, a section
$s$ of $F(P)$ is just a zero form on $M$ with values in the vector bundle $F(P)$.
Giving the connection on $P$ allows us to construct the one form $\mathbf{d}s$ with values
in $F(P)$. If $X$ is a vector field on $M$, then we can define

$$\nabla_X s := i(X)\mathbf{d}s,$$

the covariant derivative of $s$ in the direction $X$.

5.3.3 A formula for the covariant differential of basic forms. 

Let $\alpha$ be a basic form on $P$ with values in the vector space $F$ on which $G$ acts
linearly. Let $\mathbf{d}$ be the covariant differential associated with the connection form
$\omega$. We claim that

$$\mathbf{d}\alpha = d\alpha + \omega \bullet \alpha.$$ (5.19)

In order to prove this formula, it is enough to prove that when we apply $i(v)$ to
the right hand side we get zero, if $v$ is vertical. For then applying $H$ does not
change the right hand side. But applying $H$ to the right hand side yields $\mathbf{d}\alpha$
since $\mathbf{d}\alpha := H(d\alpha)$ and

$$H\omega = 0$$

so

$$H(\omega \bullet \alpha) = 0.$$

So it is enough to show that for any $\xi \in \mathfrak{g}$ we have

$$i(\xi_P)\,d\alpha = -i(\xi_P)(\omega \bullet \alpha).$$

Since $\alpha$ is basic, we have $i(\xi_P)\alpha = 0$, so by Weil's identity we have

$$i(\xi_P)\,d\alpha = D_{\xi_P}\alpha = \xi \bullet \alpha,$$

where $D_{\xi_P}$ denotes the Lie derivative,
by the infinitesimal version of the invariance condition (5.16). On the other
hand, since $i(\xi_P)\alpha = 0$ and $i(\xi_P)\omega = -\xi$, we have
$i(\xi_P)(\omega \bullet \alpha) = -\xi \bullet \alpha$, which proves our formula.

There are a couple of special cases of (5.19) worth mentioning. If $F$ is $\mathbf{R}$
with the trivial representation then (5.19) says that $\mathbf{d} = d$. This implies that
if $s$ is a section of an associated vector bundle $F(P)$, and if $\phi$ is a function on
$M$, so that $\phi s$ is again a section of $F(P)$, then

$$\mathbf{d}(\phi s) = (d\phi) \wedge s + \phi\,\mathbf{d}s$$

implying that for any vector field $X$ on $M$ we have

$$\nabla_X(\phi s) = (X\phi)s + \phi(\nabla_X s).$$

Another important special case is where we take $F = \mathfrak{g}$ with the adjoint
action. Then (5.19) says that

$$\mathbf{d}\alpha = d\alpha + [\omega\wedge, \alpha].$$

5.3.4 The curvature is $\mathbf{d}\omega$.

We wish to prove that

$$\mathbf{d}\omega = d\omega + \tfrac{1}{2}[\omega\wedge, \omega].$$ (5.20)

Both sides vanish when we apply $i(v)$ where $v$ is a vertical vector: this is true
for the left hand side by definition, and we have already verified this for the
right hand side, see equation (5.11). But if we apply $H$ to both sides, we get
$\mathbf{d}\omega$ on the left, and also on the right since $H\omega = 0$. □

5.3.5 Bianchi's identity.

In our setting this says that

$$\mathbf{d}\Omega = 0.$$ (5.21)

Proof. We have

$$d\Omega = d(d\omega) + d\,\tfrac{1}{2}[\omega\wedge, \omega] = [d\omega\wedge, \omega].$$

Applying $H$ yields zero because $H\omega = 0$. □

5.3.6 The curvature and $\mathbf{d}^2$.

We wish to show that

$$\mathbf{d}^2\alpha = \Omega \bullet \alpha.$$ (5.22)

In this equation $\alpha$ is a basic form on $P$ with values in the vector space $F$ where
$G$ acts, and we know that $\Omega$ is a basic form with values in $\mathfrak{g}$, so the right hand
side makes sense and is a basic $F$ valued form. To prove this we use our formula

$$\mathbf{d}\alpha = d\alpha + \omega \bullet \alpha$$

and apply it again to get

$$\mathbf{d}^2\alpha = d(d\alpha + \omega \bullet \alpha) + \omega \bullet (d\alpha + \omega \bullet \alpha).$$

We have $d^2 = 0$ so the first expression (under the $d$) becomes

$$d(\omega \bullet \alpha) = d\omega \bullet \alpha - \omega \bullet d\alpha.$$

The second term on the right here cancels the term $\omega \bullet d\alpha$ so we get

$$\mathbf{d}^2\alpha = d\omega \bullet \alpha + \omega \bullet (\omega \bullet \alpha).$$

So to complete the proof we must check that

$$\tfrac{1}{2}[\omega\wedge, \omega] \bullet \alpha = \omega \bullet (\omega \bullet \alpha).$$

This is a variant of a computation we have done several times before. Since
interior product with vertical vectors sends $\alpha$ to zero, while interior product with
horizontal vectors sends $\omega$ to zero, it suffices to verify that the above equation
is true after we take the interior product of both sides with two vertical vectors,
say $\eta_P$ and $\xi_P$. Now

$$i(\xi_P)[\omega\wedge, \omega] = [-\xi, \omega] + [\omega, \xi] = -2[\xi, \omega]$$

and so

$$i(\eta_P)i(\xi_P)\left(\tfrac{1}{2}[\omega\wedge, \omega] \bullet \alpha\right) = [\xi, \eta] \bullet \alpha.$$

A similar computation shows that

$$i(\eta_P)i(\xi_P)(\omega \bullet (\omega \bullet \alpha)) = \xi \bullet (\eta \bullet \alpha) - \eta \bullet (\xi \bullet \alpha).$$

But the equality of these two expressions follows from the fact that we have an
action of $G$ on $F$ which implies that for any $\xi, \eta \in \mathfrak{g}$ and any $f \in F$ we have

$$[\xi, \eta]f = \xi(\eta f) - \eta(\xi f).$$ □



Chapter 6 

Gauss's lemma. 



We have defined geodesics as being curves which are self parallel. But there
are several other characterizations of geodesics which are just as important:
for example, in a Riemann manifold geodesics locally minimize arc length: "a
straight line is the shortest distance between two points". We want to give one
explanation of this fact here, using the "exponential map," a concept introduced
by al Biruni (973-1048) but unappreciated for about 1000 years. The key result,
known as Gauss' lemma, asserts that radial geodesics are orthogonal to the
images of spheres under the exponential map, and this will allow us to relate
geodesics to extremal properties of arc length.

6.1 The exponential map. 

Suppose that $M$ is a manifold with a connection $\nabla$. Let $m_0$ be a point of $M$
and $\xi \in TM_{m_0}$. Then there is a unique (maximal) geodesic $\gamma_\xi$ with
$\gamma_\xi(0) = m_0$, $\gamma_\xi'(0) = \xi$. It is found by solving a system of second order
ordinary differential equations. The existence and uniqueness theorem for
solutions of such equations implies that the solutions depend smoothly on $\xi$. In
other words, there exists a neighborhood $N$ of $\xi$ in the tangent bundle $TM$ and
an interval $I$ about 0 in $\mathbf{R}$ such that $(\eta, s) \mapsto \gamma_\eta(s)$ is smooth on $N \times I$.

If we take $\xi = 0$, the zero tangent vector, the corresponding "geodesic",
defined for all $t$, is the constant curve $\gamma_0(t) = m_0$. The continuity thus implies
that for $\xi$ in some neighborhood of the origin in $TM_{m_0}$, the geodesic $\gamma_\xi$ is defined
for $t \in [0, 1]$. Let $\mathcal{D}$ be the set of vectors $\xi$ in $TM_{m_0}$ such that the maximal
geodesic through $\xi$ is defined on $[0, 1]$. By the preceding remarks this contains
some neighborhood of the origin. Define the exponential map

$$\exp = \exp_{m_0} : \mathcal{D} \to M, \quad \exp(\xi) = \gamma_\xi(1).$$ (6.1)

For $\xi \in TM_{m_0}$ and fixed $t \in \mathbf{R}$ the curve

$$s \mapsto \gamma_\xi(ts)$$

is a geodesic whose tangent vector at $s = 0$ is $t\xi$. So the exponential map carries
straight lines through the origin in $TM_{m_0}$ into geodesics through $m_0$ in $M$:

$$\exp : t\xi \mapsto \gamma_\xi(t).$$

Now the tangent vector to the line $t \mapsto t\xi$ at $t = 0$ is just $\xi$ under the standard
identification of the tangent space to a vector space with the vector space itself.
Also, the tangent vector to the curve $t \mapsto \gamma_\xi(t)$ at $t = 0$ is $\xi$, by the definition
of $\gamma_\xi$. So taking the derivatives of both sides shows that the differential of the
exponential map at the origin is the identity:

$$d\exp_0 : T(TM_{m_0})_0 \to TM_{m_0} = \mathrm{id}$$

under the standard identification of the tangent space $T(TM_{m_0})_0$ with $TM_{m_0}$.

From the inverse function theorem it follows that the exponential map is a
diffeomorphism in some neighborhood of the origin. Let $\mathcal{U}$ be a star shaped
neighborhood of the origin in $TM_{m_0}$ on which exp is a diffeomorphism, and let
$U := \exp(\mathcal{U})$ be its image in $M$ under the exponential map. Then $U$ is called
a normal neighborhood of $m_0$. By construction (and the uniqueness theorem
for differential equations) for every $m \in U$ there exists a unique geodesic which
joins $m_0$ to $m$ and lies entirely in $U$.
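
As a concrete instance, the exponential map of the unit sphere in three
dimensional Euclidean space has a closed form, because the geodesics of the
round metric are great circles. The sketch below assumes that standard fact;
the function name `exp_sphere` is hypothetical, and the caller is assumed to pass
a unit vector p and a tangent vector xi orthogonal to p.

```python
import math

def exp_sphere(p, xi):
    """exp_p(xi) on the unit sphere: follow the great circle through p with
    initial velocity xi for unit time. Assumes |p| = 1 and xi orthogonal to p."""
    norm = math.sqrt(sum(c * c for c in xi))
    if norm == 0.0:
        return tuple(p)               # the zero vector gives the constant curve
    return tuple(math.cos(norm) * pc + math.sin(norm) * xc / norm
                 for pc, xc in zip(p, xi))

north = (0.0, 0.0, 1.0)
print(exp_sphere(north, (math.pi / 2, 0.0, 0.0)))   # a point on the equator
print(exp_sphere(north, (math.pi, 0.0, 0.0)))       # the south pole, up to rounding
```

Here exp is a diffeomorphism on the open disk of radius pi in the tangent plane,
so the sphere minus the antipodal point is a normal neighborhood of p.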

6.2 Normal coordinates. 

Suppose that we choose a basis $e = (e_1, \dots, e_n)$ of $TM_{m_0}$ and let $\ell^1, \dots, \ell^n$ be
the dual basis. We then get a coordinate system on $U$ defined by

$$\exp^{-1}(m) = \sum_i x^i(m)e_i$$

or, what is the same,

$$x^i = \ell^i \circ \exp^{-1}.$$

These coordinates are known as normal coordinates, or sometimes as inertial
coordinates for the following reason:

Let $\xi = \sum a^i e_i$ be an element of $\mathcal{U} \subset TM_{m_0}$. Since $\exp(t\xi) = \gamma_\xi(t)$ the
coordinates of $\gamma_\xi(t)$ are given by

$$x^i(\gamma_\xi(t)) = ta^i.$$

Thus the second derivative of $x^i(\gamma_\xi(t))$ with respect to $t$ vanishes and the
geodesic equations (satisfied by $\gamma_\xi(t)$) become

$$\sum_{ij} \Gamma^k_{ij}(\gamma_\xi(t))\,a^i a^j = 0, \quad \forall k.$$

In particular, evaluating at $t = 0$ we get

$$\sum_{ij} \Gamma^k_{ij}(0)\,a^i a^j = 0, \quad \forall k.$$

But this must hold for all (sufficiently small) values of the $a^i$ and hence for all
values of the $a^i$. If the connection has zero torsion, so that the $\Gamma^k_{ij}$ are symmetric
in $i$ and $j$, this implies that

$$\Gamma^k_{ij}(0) = 0.$$ (6.2)

In a normal coordinate system, the Christoffel symbols of a torsionless
connection vanish at the origin. Hence at this one point, the equations for a geodesic
look like the equations of a straight line in terms of these coordinates. This was
Einstein's resolution of Mach's problem: How can the laws of physics, particularly
mechanics, involve rectilinear motion in the absence of forces, when this depends
on the coordinate system? According to Einstein the distribution of matter in
the universe determines the metric which then determines the connection which
picks out the inertial frame.

6.3 The Euler field $\mathcal{E}$ and its image $\mathcal{P}$.

The multiplicative group $\mathbf{R}^+$ acts on any vector space: $r \in \mathbf{R}^+$ sends any vector
$v$ into $rv$. We set

$$r = e^t.$$

The vector field $\mathcal{E}$ corresponding to the one parameter group

$$v \mapsto e^t v$$

is known as the Euler operator. From its definition, if $q$ is a homogeneous
polynomial of degree $k$, then

$$\mathcal{E}q = kq,$$

an equation which is known as Euler's equation. Also from its definition,
differentiating the curve

$$t \mapsto e^t v$$

at $t = 0$ shows that the value of $\mathcal{E}$ at any vector $v$ is

$$\mathcal{E}(v) = v$$

under the natural identification of the tangent space at $v$ of the vector space
with the vector space itself.
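
Euler's equation can be checked numerically straight from the definition of the
Euler operator as the derivative along the flow of scalings. The following is a
minimal sketch; the polynomial q, the sample point and the step size are
arbitrary choices made for illustration.

```python
import math

def euler_derivative(q, v, h=1e-6):
    """Central difference approximation of d/dt q(e^t v) at t = 0."""
    plus = q(*[math.exp(h) * c for c in v])
    minus = q(*[math.exp(-h) * c for c in v])
    return (plus - minus) / (2 * h)

def q(x, y):
    """A homogeneous polynomial of degree 3 (an arbitrary example)."""
    return x * x * y + 2 * y ** 3

v = (1.5, -0.7)
print(euler_derivative(q, v), 3 * q(*v))   # the two numbers agree closely
```

Since q(e^t v) = e^{3t} q(v) for this q, the derivative at t = 0 is 3 q(v), which
is what the finite difference reproduces up to discretization error.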

We want to consider the Euler field on the tangent space $TM_{m_0}$ (and its
restriction to the star shaped neighborhood $\mathcal{U}$) and its image under the
exponential map, call it $\mathcal{P}$. So $\mathcal{P}$ is a vector field defined on $U$. Since

$$\exp(r\xi) = \gamma_\xi(r)$$

we have

$$\mathcal{P}(\exp\xi) = \frac{d}{dt}\left(\exp(e^t\xi)\right)\Big|_{t=0} = \frac{d}{dt}\gamma_\xi(e^t)\Big|_{t=0}$$

so

$$\mathcal{P}(\exp\xi) = \dot\gamma_\xi(1)$$

where we are using the dot to denote differentiation of the geodesic $r \mapsto \gamma_\xi(r)$
with respect to $r$. Applied to the vector $s\xi$ we obtain

$$\mathcal{P}(\gamma_\xi(s)) = s\dot\gamma_\xi(s).$$

We claim that

$$\nabla_{\mathcal{P}}\mathcal{P} = \mathcal{P}.$$ (6.3)

Indeed,

$$\nabla_{\mathcal{P}}\mathcal{P}(\gamma_\xi(t)) = t\nabla_{\dot\gamma_\xi(t)}(t\dot\gamma_\xi(t))$$
$$= t\dot\gamma_\xi(t) + t^2\nabla_{\dot\gamma_\xi(t)}\dot\gamma_\xi(t)$$
$$= t\dot\gamma_\xi(t) \quad \text{since } \gamma_\xi \text{ is a geodesic}$$
$$= \mathcal{P}(\gamma_\xi(t)).$$

Since the points of the form $\gamma_\xi(t)$ fill out the normal neighborhood, we conclude
that (6.3) holds.

Suppose that we have chosen a basis $e = (e_1, \dots, e_n)$ of $TM_{m_0}$ and so the
corresponding normal coordinates $x^1, \dots, x^n$ on $U$. Each vector $e_i$ determines
the "constant" vector field on $TM_{m_0}$ which assigns to each vector $\xi$ the value $e_i$
(under the identification of $T(TM_{m_0})_\xi$ with $TM_{m_0}$). Let us temporarily introduce
the notation $\tilde{e}_i$ to denote this vector field. As $\ell^1, \dots, \ell^n$ form the dual basis,
each of the $\ell^j$ is a linear function on $TM_{m_0}$, and the derivative of the
function $\ell^j$ with respect to the vector field $\tilde{e}_i$ is given by

$$\tilde{e}_i\ell^j = 0, \ i \neq j, \qquad \tilde{e}_i\ell^i = 1.$$

Now $x^i = \ell^i \circ \exp^{-1}$ so we conclude that under the exponential map the vector
field $\tilde{e}_i$ is carried over into $\partial_i$ in terms of the normal coordinates.
Now

$$\mathcal{E}(\xi) = \sum_i \ell^i(\xi)\tilde{e}_i$$

and $\ell^i(\xi) = x^i(m)$ if $\xi = \exp^{-1}(m)$. We conclude that the expression for $\mathcal{P}$ in
normal coordinates is given by

$$\mathcal{P} = \sum_i x^i\partial_i.$$ (6.4)

Thus in normal coordinates, the expression for $\mathcal{P}$ is the same as the expression
for the Euler operator $\mathcal{E}$ in linear coordinates.


6.4 The normal frame field. 



Let $E_i$ be the vector field obtained from $\partial_i(m_0) = e_i$ by parallel translation along
the $\gamma_\xi(t)$. By the existence and uniqueness theorem for differential equations,
we know that $E_i$ is a smooth vector field on our normal neighborhood $U$. By
the definition of $\mathcal{P}$ we have

$$\nabla_{\mathcal{P}}(E_i) = 0.$$ (6.5)

Notice that at the single point $m_0$ we have $E_i(m_0) = \partial_i(m_0)$ but this equality
need not hold at any other point. But $E = (E_1, \dots, E_n)$ is a frame field
which is covariantly constant with respect to $\mathcal{P}$. We call it the normal frame
field (associated to the basis $(e_1, \dots, e_n)$). We then also construct the dual
frame field $\theta$ which is also covariantly constant with respect to $\mathcal{P}$.
We claim that, remarkably,

$$\mathcal{P} = \sum_i x^i E_i.$$ (6.6)

Indeed, the coefficients $\theta^i(\mathcal{P})$ of $\mathcal{P}$ with respect to the $E_i$ are smooth functions
on our normal neighborhood. Our first claim is that these functions are (in
terms of our normal coordinates) homogeneous functions of order one. To show
this it is enough, by Euler's theorem, to show that they satisfy the equation
$\mathcal{P}f = f$. But we have

$$\mathcal{P}\theta^i(\mathcal{P}) = (\nabla_{\mathcal{P}}\theta^i)(\mathcal{P}) + \theta^i(\nabla_{\mathcal{P}}\mathcal{P}) = \theta^i(\mathcal{P})$$

since $\nabla_{\mathcal{P}}\theta^i = 0$ and $\nabla_{\mathcal{P}}\mathcal{P} = \mathcal{P}$. So each of the $\theta^i(\mathcal{P})$ is a homogeneous linear
function in terms of the normal coordinates.

This means that we can write $\theta^i(\mathcal{P}) = \sum_j a_{ij}x^j$ for some constants $a_{ij}$. Thus

$$\mathcal{P} = \sum_{ij} a_{ij}x^j E_i = \sum_k x^k\partial_k.$$

We want to show that $a_{ij} = \delta_{ij}$. By definition, $E_i(0) = \partial_i(0)$. If we write

$$|x|^2 = \sum_i (x^i)^2$$

we have

$$\sum_{ij} a_{ij}x^j E_i = \sum_{ij} a_{ij}x^j\partial_i + O(|x|^2)$$

and also

$$\sum_{ij} a_{ij}x^j E_i = \sum_j x^j\partial_j.$$

The only way that two linear expressions can agree up to terms quadratic or
higher is if they are equal, so we have proved that (6.6) holds.



6.5 Gauss' lemma. 

Now suppose that $M$ is a semi-Riemannian manifold, and $\nabla$ is the corresponding
Levi-Civita connection. We choose our basis $e = (e_1, \dots, e_n)$ of $TM_{m_0}$ to be
"orthonormal", so that $E = (E_1, \dots, E_n)$ is an "orthonormal" frame field.
Since the $E_i$ form an orthonormal frame at each point, it follows from (6.6)
that

$$(\mathcal{P}, \mathcal{P}) = \sum_i \epsilon_i (x^i)^2$$ (6.7)

(where $\epsilon_i = (e_i, e_i) = \pm 1$). We claim that we also have

$$(\mathcal{P}, \partial_i) = \epsilon_i x^i.$$ (6.8)

To prove this observe that

$$(\mathcal{P}, \partial_i) = \sum_j x^j(\partial_j, \partial_i) = \epsilon_i x^i + O(|x|^2).$$

So it is enough to show that

$$\mathcal{P}(\mathcal{P}, \partial_i) = \epsilon_i x^i$$

in order to conclude (6.8). Now $[\mathcal{P}, \partial_i] = -\partial_i$ from the formula (6.4) for $\mathcal{P}$, and
hence

$$\nabla_{\mathcal{P}}\partial_i = \nabla_{\partial_i}\mathcal{P} - \partial_i$$

since the torsion of the Levi-Civita connection vanishes. Hence

$$\mathcal{P}(\mathcal{P}, \partial_i) = (\nabla_{\mathcal{P}}\mathcal{P}, \partial_i) + (\mathcal{P}, \nabla_{\mathcal{P}}\partial_i)$$
$$= (\mathcal{P}, \partial_i) + (\mathcal{P}, \nabla_{\partial_i}\mathcal{P}) - (\mathcal{P}, \partial_i)$$
$$= \tfrac{1}{2}\partial_i(\mathcal{P}, \mathcal{P})$$
$$= \epsilon_i x^i.$$

In particular, it follows from (6.8) that

$$(\mathcal{P}, \epsilon_j x^i\partial_j - \epsilon_i x^j\partial_i) = 0.$$ (6.9)

Now the vector fields

$$\epsilon_j x^i\partial_j - \epsilon_i x^j\partial_i$$

correspond, under the exponential map, to the vector fields

$$\epsilon_j\ell^i\tilde{e}_j - \epsilon_i\ell^j\tilde{e}_i$$

which generate the one parameter group of "rotations" in the $e_i, e_j$ plane in
$TM_{m_0}$. These rotations, acting in the tangent space, when applied to any point,
sweep out the "pseudo-sphere" centered at the origin and passing through that
point. Let $S_\xi$ be the pseudo-sphere in the tangent space $TM_{m_0}$ passing through
the point $\xi \in TM_{m_0}$ and let $\Sigma_p = \exp(S_\xi)$ be its image under the exponential
map. Then we can restate equation (6.9) as

Proposition 9 The radial geodesic through the point $p = \exp(\xi)$ is orthogonal
in the Riemann metric to the hypersurface $\Sigma_p$.

This result is known as Gauss' lemma.
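
Gauss' lemma can be observed numerically on the unit sphere, where the
exponential map at the north pole has a closed form because geodesics are great
circles. The sketch below is an illustration under that assumption, not a proof:
it approximates by finite differences the tangent to the radial geodesic and the
tangent to the image of a circle centered at the origin, and evaluates their inner
product. All function names are hypothetical.

```python
import math

def exp_north_pole(xi):
    """Exponential map of the unit sphere at p = (0, 0, 1), with xi = (a, b)
    read in the tangent plane at p."""
    a, b = xi
    rho = math.hypot(a, b)
    if rho == 0.0:
        return (0.0, 0.0, 1.0)
    s = math.sin(rho) / rho
    return (s * a, s * b, math.cos(rho))

def derivative(curve, t, h=1e-6):
    """Central difference approximation of the velocity of a curve in R^3."""
    plus, minus = curve(t + h), curve(t - h)
    return tuple((u - v) / (2 * h) for u, v in zip(plus, minus))

def gauss_lemma_defect(xi):
    """Inner product of the radial tangent with the tangent to the image sphere."""
    a, b = xi
    radial = derivative(lambda t: exp_north_pole((t * a, t * b)), 1.0)
    rotate = lambda th: (a * math.cos(th) - b * math.sin(th),
                         a * math.sin(th) + b * math.cos(th))
    tangential = derivative(lambda th: exp_north_pole(rotate(th)), 0.0)
    return sum(u * v for u, v in zip(radial, tangential))

print(gauss_lemma_defect((0.8, -0.5)))   # zero up to discretization error
```

The induced metric on the sphere is the restriction of the Euclidean inner
product, which is why the ordinary dot product in R^3 is used here.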
6.6 Minimization of arc length.

We now specialize to the Riemannian case so that $S_\xi$ is an actual sphere in the
Euclidean sense. Let $\gamma : [0, 1] \to M$ be any curve joining $m_0$ to a point $m$ in
the normal neighborhood. In particular, for small values of $t$ the points $\gamma(t)$ all
lie in the normal neighborhood. Let

$$d = |x(m)|$$

i.e. $d^2 = \sum_i (x^i)^2$ in terms of the normal coordinates. We know that $d$ is the
length of the geodesic emanating from $m_0$ and ending at $m$ by the definition of
the exponential map and normal coordinates. We wish to show that

$$\text{length of } \gamma \geq d.$$

In other words, that the geodesic joining $m_0$ to $m$ is the shortest curve joining
$m_0$ to $m$. Since $\gamma(1) = m$ we have $|x(\gamma(1))| = d$.

Let $T$ be the first time that $|x(\gamma(t))| \geq d$. (That is, $T$ is the greatest lower
bound of the set of all $t$ for which $\gamma(t)$ does not lie strictly inside the sphere
of radius $d$ in normal coordinates.) Then $\gamma(T)$ must lie on the surface $\Sigma$, the
image of the sphere of radius $d$ under the exponential map. It is enough to
prove that the curve $\gamma : [0, T] \to M$ has length $\geq d$, where now $x(\gamma(t))$ lies inside
the sphere of radius $d$ for all $0 \leq t < T$. By the same argument, we may assume
that $|x(\gamma(t))| > 0$ for all $t > 0$.

Let $u$ denote the unit vector field in the radial direction, defined outside the
origin in the normal coordinates. So $u(x) = \frac{1}{|x|}\mathcal{P}(x)$. Decompose the tangent
vector $\gamma'(t)$ into its component along $u$ and its component, $\tau$, along the plane
spanned by the vector fields $x^i\partial_j - x^j\partial_i$. So

$$\gamma'(t) = c(t)u(t) + \tau(t).$$

Then

$$\frac{d}{dt}|x(\gamma(t))| = c(t).$$

On the other hand, $u(t)$ and $\tau(t)$ are orthogonal relative to the Riemann metric,
and hence

$$\|\gamma'(t)\|^2 = |c(t)|^2 + \|\tau(t)\|^2$$

so

$$\|\gamma'(t)\| \geq |c(t)|$$

and hence

$$\text{length of } \gamma \geq \int_0^T |c(t)|\,dt \geq \int_0^T c(t)\,dt = d$$

as was to be proved.
Chapter 7

Special relativity

7.1 Two dimensional Lorentz transformations.

We study a two dimensional vector space with scalar product $( \ , \ )$ of signature
$+ -$. A Lorentz transformation is a linear transformation which preserves the
scalar product. In particular it preserves

$$\|\mathbf{u}\|^2 := (\mathbf{u}, \mathbf{u})$$

(where with the usual abuse of notation this expression can be positive, negative
or zero). In particular, every such transformation must preserve the "light cone"
consisting of all $\mathbf{u}$ with $\|\mathbf{u}\|^2 = 0$.

All such two dimensional spaces are isomorphic. In particular, we can choose
our vector space to be $\mathbf{R}^2$ with metric given by

$$\left\|\begin{pmatrix} u \\ v \end{pmatrix}\right\|^2 = uv.$$

The light cone consists of the coordinate axes, so every Lorentz transformation
must carry the axes into themselves or interchange the axes. A transformation
which preserves the axes is just a diagonal matrix. Hence the (connected
component of the identity of the) Lorentz group consists of all matrices of the form

$$\begin{pmatrix} r & 0 \\ 0 & r^{-1} \end{pmatrix}, \quad r > 0.$$

So the group is isomorphic to the multiplicative group of the positive real
numbers. We introduce $(t, x)$ coordinates by



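$$u = t + x, \qquad v = t - x$$

or

$$\begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}\begin{pmatrix} t \\ x \end{pmatrix}$$

so

$$\left\|\begin{pmatrix} u \\ v \end{pmatrix}\right\|^2 = t^2 - x^2.$$

Notice that

$$\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}^2 = 2\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$

so if

$$\begin{pmatrix} u' \\ v' \end{pmatrix} = \begin{pmatrix} r & 0 \\ 0 & r^{-1} \end{pmatrix}\begin{pmatrix} u \\ v \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} u' \\ v' \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}\begin{pmatrix} t' \\ x' \end{pmatrix}$$

then

$$\begin{pmatrix} t' \\ x' \end{pmatrix} = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}\begin{pmatrix} r & 0 \\ 0 & r^{-1} \end{pmatrix}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}\begin{pmatrix} t \\ x \end{pmatrix}.$$

Multiplying out the matrices gives

$$\begin{pmatrix} t' \\ x' \end{pmatrix} = \gamma\begin{pmatrix} 1 & w \\ w & 1 \end{pmatrix}\begin{pmatrix} t \\ x \end{pmatrix}$$ (7.1)

where

$$\gamma := \frac{r + r^{-1}}{2}$$ (7.2)

and

$$w := \frac{r - r^{-1}}{r + r^{-1}}.$$ (7.3)

The parameter $w$ is called the "velocity" and is, of course, restricted by

$$|w| < 1.$$ (7.4)

We have

$$1 - w^2 = \frac{r^2 + 2 + r^{-2} - (r^2 - 2 + r^{-2})}{(r + r^{-1})^2} = \frac{4}{(r + r^{-1})^2}$$

so

$$\gamma = \frac{1}{\sqrt{1 - w^2}}.$$ (7.5)

Thus $w$ determines $\gamma$. Similarly, we can recover $r$ from $w$:

$$r = \sqrt{\frac{1 + w}{1 - w}}.$$

So we can use $w$ to parameterize the Lorentz transformations. We write

$$L_w := \gamma\begin{pmatrix} 1 & w \\ w & 1 \end{pmatrix}.$$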
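7.1.1 Addition law for velocities.

It is useful to express the multiplication law in terms of the velocity parameter.
If

$$w_1 = \frac{r - r^{-1}}{r + r^{-1}}, \qquad w_2 = \frac{s - s^{-1}}{s + s^{-1}}$$

then

$$\frac{rs - (rs)^{-1}}{rs + (rs)^{-1}} = \frac{\dfrac{r - r^{-1}}{r + r^{-1}} + \dfrac{s - s^{-1}}{s + s^{-1}}}{1 + \dfrac{r - r^{-1}}{r + r^{-1}} \cdot \dfrac{s - s^{-1}}{s + s^{-1}}}$$

so we obtain

$$L_{w_1} \circ L_{w_2} = L_w \quad \text{where} \quad w = \frac{w_1 + w_2}{1 + w_1 w_2}.$$ (7.6)

This is known as the "addition law for velocities".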
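
The addition law can be confirmed numerically by multiplying the matrices. The
sketch below is only an illustration with hypothetical function names and an
arbitrary pair of velocities.

```python
import math

def boost(w):
    """The matrix L_w = gamma * [[1, w], [w, 1]]."""
    g = 1.0 / math.sqrt(1.0 - w * w)
    return ((g, g * w), (g * w, g))

def compose(a, b):
    """Product of two 2x2 matrices."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def add_velocities(w1, w2):
    """The addition law (7.6)."""
    return (w1 + w2) / (1.0 + w1 * w2)

product = compose(boost(0.5), boost(0.5))
direct = boost(add_velocities(0.5, 0.5))
print(add_velocities(0.5, 0.5))   # stays below 1 although 0.5 + 0.5 = 1
print(product, direct)            # the two matrices agree up to rounding
```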
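
7.1.2 Hyperbolic angle.

One also introduces the "hyperbolic angle", actually a real number, $\phi$ by

$$r = e^\phi$$

so

$$\gamma = \cosh\phi = \frac{e^\phi + e^{-\phi}}{2}$$

and

$$L_w = \begin{pmatrix} \cosh\phi & \sinh\phi \\ \sinh\phi & \cosh\phi \end{pmatrix}.$$

Here

$$w = \tanh\phi.$$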
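
In terms of the hyperbolic angle the composition law becomes plain addition,
since r = e^phi turns matrix multiplication into multiplication of positive
numbers. A minimal sketch, with hypothetical function names:

```python
import math

def hyperbolic_angle(w):
    """phi with w = tanh(phi)."""
    return math.atanh(w)

def add_velocities(w1, w2):
    """The addition law for velocities."""
    return (w1 + w2) / (1.0 + w1 * w2)

phi_sum = hyperbolic_angle(0.5) + hyperbolic_angle(0.5)
phi_of_sum = hyperbolic_angle(add_velocities(0.5, 0.5))
print(phi_sum, phi_of_sum)                 # hyperbolic angles add
print(math.exp(hyperbolic_angle(0.6)))     # the parameter r for w = 0.6
```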
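For any

$$\begin{pmatrix} t \\ x \end{pmatrix}$$

with $t > 0$ and $t^2 - x^2 = 1$, we must have $t > x$ and
$t - x = (t + x)^{-1}$ so

$$\begin{pmatrix} t + x \\ t - x \end{pmatrix} = \begin{pmatrix} r \\ r^{-1} \end{pmatrix}, \quad r = t + x.$$

This shows that the group of all (one dimensional proper) Lorentz
transformations, $\{L_w\}$, acts simply transitively on the hyperbola

$$\left\|\begin{pmatrix} t \\ x \end{pmatrix}\right\|^2 = 1, \quad t > 0.$$

This means that if $\mathbf{u} = (t, x)^T$ and $\mathbf{u}' = (t', x')^T$ are two points on this
hyperbola, there is a unique $L_w$ with

$$L_w\mathbf{u} = \mathbf{u}'.$$

If

$$\mathbf{u} = L_z\begin{pmatrix} 1 \\ 0 \end{pmatrix}$$

this means that

$$\mathbf{u}' = L_w L_z\begin{pmatrix} 1 \\ 0 \end{pmatrix} = L_z L_w\begin{pmatrix} 1 \\ 0 \end{pmatrix}$$

and so, since $L_z$ preserves the scalar product,

$$(\mathbf{u}, \mathbf{u}') = tt' - xx' = \left(\begin{pmatrix} 1 \\ 0 \end{pmatrix}, L_w\begin{pmatrix} 1 \\ 0 \end{pmatrix}\right).$$

Writing $w = \tanh\phi$ as above we have

$$(\mathbf{u}, \mathbf{u}') = \cosh\phi$$

and $\phi$ is called the hyperbolic angle between $\mathbf{u}$ and $\mathbf{u}'$.

More generally, if we don't require $\|\mathbf{u}\| = \|\mathbf{u}'\| = 1$ but merely $\|\mathbf{u}\| > 0$,
$\|\mathbf{u}'\| > 0$, $t > 0$, $t' > 0$, we define the hyperbolic angle between them to be
the hyperbolic angle between the corresponding unit vectors so

$$(\mathbf{u}, \mathbf{u}') = \|\mathbf{u}\|\|\mathbf{u}'\|\cosh\phi.$$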

7.1.3 Proper time.

A material particle is a curve $\alpha : \tau \mapsto \alpha(\tau)$ whose tangent vector $\alpha'(\tau)$ has
positive $t$ coordinate everywhere and satisfies

$$\|\alpha'(\tau)\| \equiv 1.$$

Of course, this fixes the parameter $\tau$ up to an additive constant. $\tau$ is called the
proper time of the material particle. It is to be thought of as the "internal
clock" of the material particle. For an unstable particle, for example, it is this
internal clock which tells the particle that its time is up. Let $\partial_0$ denote the unit
vector in the $t$ direction,

$$\partial_0 := \begin{pmatrix} 1 \\ 0 \end{pmatrix}.$$
7.1.4 Time dilatation.

Let us write $t(\tau)$ for the $t$ coordinate of $\alpha(\tau)$ and $x(\tau)$ for its $x$ coordinate so
that

$$\alpha(\tau) = \begin{pmatrix} t(\tau) \\ x(\tau) \end{pmatrix}, \qquad \alpha' := \frac{d\alpha}{d\tau} = \begin{pmatrix} dt/d\tau \\ dx/d\tau \end{pmatrix}.$$

We have

$$\frac{dt}{d\tau} = (\partial_0, \alpha') = \cosh\phi = \frac{1}{\sqrt{1 - w^2}} \geq 1$$

where

$$w := \frac{dx}{dt} = \frac{dx/d\tau}{dt/d\tau}$$ (7.7)

is the "velocity" of the particle measured in the $t, x$ coordinate system. Thus
the internal clock of a moving particle appears to run slow in any coordinate
system where it is not at rest. This phenomenon, known as "time dilatation", is
observed all the time in elementary particle physics. For example, fast moving
muons make it from the upper atmosphere to the ground before decaying due
to this effect.
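
The muon example can be given rough numbers. The sketch below uses the
approximate mean proper lifetime of a muon, about 2.2 microseconds, and the
speed of light in ordinary units; the velocity 0.998 (in units where the speed of
light is 1) is a hypothetical value chosen only for illustration.

```python
import math

C = 299_792_458.0        # speed of light in m/s
MUON_LIFETIME = 2.2e-6   # approximate mean proper lifetime in s

def gamma(w):
    """The factor dt/dtau = 1 / sqrt(1 - w^2) for a velocity w with |w| < 1."""
    return 1.0 / math.sqrt(1.0 - w * w)

def mean_path_length(w, dilated=True):
    """Mean distance in meters travelled before decay, with or without
    time dilatation taken into account."""
    lifetime = gamma(w) * MUON_LIFETIME if dilated else MUON_LIFETIME
    return w * C * lifetime

w = 0.998   # hypothetical velocity
print(gamma(w))
print(mean_path_length(w, dilated=False))   # well under a kilometer
print(mean_path_length(w))                  # roughly ten kilometers
```

Without the factor gamma the mean path would be far too short for a muon
produced in the upper atmosphere to reach the ground.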



7.1.5 Lorentz-Fitzgerald contraction.

Let $\alpha$ and $\beta$ be material particles whose trajectories are parallel straight lines.
Once we have chosen a Minkowski basis, we have a notion of "simultaneity"
relative to that basis, meaning that we can adjust the arbitrary additive constant
in the definition of the proper time of each particle so that the two parallel
straight lines are given by

$$\alpha(\tau) = \tau\begin{pmatrix} a \\ b \end{pmatrix}, \qquad \beta(\tau) = \tau\begin{pmatrix} a \\ b \end{pmatrix} + \begin{pmatrix} 0 \\ \ell \end{pmatrix}.$$

We can then think of the configuration as the motion of the end points of a
"rigid rod" of length $\ell$. The length $\ell$ depends on our notion of simultaneity. For
example, suppose we apply a Lorentz transformation $L_w$ to obtain $a = 1$, $b = 0$
(and readjust the additive constants in the clocks to achieve simultaneity). The
corresponding frame is called the rest frame of the rod, and its length, $\ell_{rest}$,
called the rest length of the rod, is related to the length $\ell_{lab}$ in our
"laboratory frame" by

$$\ell_{rest} = (\cosh\phi)\,\ell_{lab}$$

or

$$\ell_{lab} = \sqrt{1 - w^2}\,\ell_{rest},$$ (7.8)

that is, a moving object "contracts" in the direction of its motion. This is the
Lorentz-Fitzgerald contraction, which was discovered before special relativity in the
context of electromagnetic theory, and can be considered as a forerunner of special
relativity. As an effect in the laboratory, it is not nearly as important as time
dilatation.

7.1.6 The reverse triangle inequality. 

Consider any interval, say $[0,T]$, on the $t$ axis, and let $0 < s < T$. The
curve $t^2 - x^2 = s^2$ bends away from the origin. In other words, all other
vectors with $t$ coordinate equal to $s$ have smaller Minkowski length:

\[ \left\| \begin{pmatrix} s \\ x \end{pmatrix} \right\|^2 = s^2 - x^2 < s^2, \qquad x \neq 0. \]

The length of any timelike vector $u := \begin{pmatrix} s \\ x \end{pmatrix}$ is
$< s$ if $x \neq 0$. Similarly, the Minkowski length of the (timelike) vector,
$v$, joining $\begin{pmatrix} s \\ x \end{pmatrix}$ to
$\begin{pmatrix} T \\ 0 \end{pmatrix}$ is $< T - s$. We conclude that

\[ \|u + v\| \ge \|u\| + \|v\| \tag{7.9} \]

with equality holding only if $u$ and $v$ actually lie on the $t$ axis. There is
nothing special in this argument about the $t$ axis, or the fact that we are in
two dimensions. It holds for any pair of forward timelike vectors, with equality
holding if and only if the vectors are collinear. Inequality (7.9) is known as
the reverse triangle inequality. The classical way of putting this is to say
that the time measured by a clock moving along a (timelike) straight line path
joining the events P and Q is longer than the time measured along any (timelike
forward) broken path joining P to Q. It is also called the "twin effect". The
twin moving along the broken path (if he survives the bumps) will be younger
than the twin who moves along the uniform path. This was known as the twin
paradox. It is no paradox, just an immediate corollary of the reverse triangle
inequality.
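
The twin effect can be checked numerically. The following is a minimal Python sketch in natural units. The function names and the two sample paths are hypothetical choices for illustration, not taken from the text. It adds up the Minkowski lengths of the straight pieces of a path, which is the proper time recorded by a clock carried along it.

```python
import math

def minkowski_length(dt, dx):
    """Minkowski length sqrt(dt^2 - dx^2) of a timelike displacement."""
    square = dt * dt - dx * dx
    if square < 0:
        raise ValueError("displacement is spacelike, not timelike")
    return math.sqrt(square)

def elapsed_proper_time(events):
    """Sum the Minkowski lengths of the segments joining consecutive events (t, x)."""
    total = 0.0
    for (t0, x0), (t1, x1) in zip(events, events[1:]):
        total += minkowski_length(t1 - t0, x1 - x0)
    return total

# Hypothetical paths joining the same two events P = (0, 0) and Q = (10, 0).
stay_at_home = [(0.0, 0.0), (10.0, 0.0)]            # straight path
traveller = [(0.0, 0.0), (5.0, 3.0), (10.0, 0.0)]   # broken path

print(elapsed_proper_time(stay_at_home))
print(elapsed_proper_time(traveller))
```

The broken path always yields the smaller number, in agreement with (7.9).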

7.1.7 Physical significance of the Minkowski distance. 



We wish to give an interpretation of the Minkowski square length (due originally
to Robb (1936)) in terms of signals and clocks. Consider points
$\begin{pmatrix} t_1 \\ 0 \end{pmatrix}$ and $\begin{pmatrix} t_2 \\ 0 \end{pmatrix}$
on the $t$ axis which are joined to the point $\begin{pmatrix} t \\ x \end{pmatrix}$
by light rays (lines parallel to $t = x$ or $t = -x$). Then (assuming
$t_2 > t > t_1$)

\[ t - t_1 = |x| \quad\text{so}\quad t_1 = t - |x| \]

and

\[ t_2 - t = |x| \quad\text{so}\quad t_2 = t + |x|, \]

hence

\[ t_1 t_2 = t^2 - x^2. \tag{7.10} \]

This equation has the following significance: Point
$P = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$ at rest or in uniform motion wishes to
communicate with point $Q = \begin{pmatrix} t \\ x \end{pmatrix}$. It records the
time, $t_1$, on its clock when a light signal was sent to Q and the time $t_2$
when the answer was received (assuming an instantaneous response). Even though
the individual times depend on the coordinates, their product, $t_1 t_2$, gives
the square of the Minkowski norm of the vector joining P to Q.
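
Equation (7.10) is easy to test on numbers. The sketch below is a hypothetical Python illustration (the function names are not from the text). It assumes the observer sits on the t axis and uses |x| so that the sign of x does not matter.

```python
def radar_times(t, x):
    """Emission and reception times on the t axis for a light signal
    that is reflected at the event (t, x), in natural units."""
    return t - abs(x), t + abs(x)

def minkowski_square(t, x):
    """Square of the Minkowski norm of the vector (t, x)."""
    return t * t - x * x

# Hypothetical event Q = (5, 3).
t1, t2 = radar_times(5.0, 3.0)
print(t1 * t2, minkowski_square(5.0, 3.0))
```

The two printed numbers agree: the product of the two clock readings recovers the invariant square length without any reference to the x coordinate.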



7.1.8 Energy-momentum 

In classical mechanics, a momentum vector is usually considered to be an element
of the cotangent space, i.e. the dual space to the tangent space. Thus in our
situation, where we identify all tangent spaces with the Minkowski plane itself,
a "momentum" vector will be a row vector of the form $\mu = (E,p)$. For a
material particle the associated momentum vector, called the "energy momentum
vector" in special relativity, is a row vector with the property that the
evaluation map

\[ v \mapsto \mu(v) \]

for any vector $v$ is a positive multiple of the scalar product evaluation

\[ v \mapsto (v, \alpha'(\tau)). \]

In other words, evaluation under $\mu$ is the same as scalar product with
$m\alpha'$ where $m$ is an invariant of the material particle known as the rest
mass, constant throughout its motion. So in the rest frame of the particle,
where $\alpha' = \partial_0$, the energy momentum vector has the form $(m, 0)$.
Here $m$ is identified (up to a choice of units, and we will have more to say
about units later) with the usual notion of mass, as determined by collision
experiments, for example. In a general frame we will have

\[ \mu = (E,p), \qquad E^2 - p^2 = m^2. \tag{7.11} \]



In this frame we have

\[ p/E = w \tag{7.12} \]

where $w$ is the velocity as defined in (7.7). We can solve equations (7.11) and
(7.12) to obtain

\[ E = \frac{m}{\sqrt{1 - w^2}} \tag{7.13} \]

\[ p = \frac{mw}{\sqrt{1 - w^2}}. \tag{7.14} \]

For small values of $w$ we have the Taylor expansion

\[ \frac{1}{\sqrt{1 - w^2}} = 1 + \frac{1}{2} w^2 + \cdots \]

and so we have

\[ E = m + \frac{1}{2} m w^2 + \cdots \tag{7.15} \]

\[ p = mw + \frac{1}{2} m w^3 + \cdots. \tag{7.16} \]

The first term in (7.16) looks like the classical expression $p = mw$ for the
momentum in terms of the velocity if we think of $m$ as the classical mass,
and the second term in (7.15) looks like the classical expression for the kinetic
energy. We are thus led to the following modification of the classical definitions
of energy and momentum. Associated to any object there is a definite value of
$m$ called its rest mass. If the object is at rest in a given frame, its rest mass
coincides with the classical notion of mass; when it is in motion relative to a
given frame, its energy momentum vector is of the form $(E,p)$ where $E$ and $p$
are determined by equations (7.13) and (7.14). We have been implicitly assuming
that $m > 0$, which implies that $|w| < 1$. We can supplement these particles by
particles of rest mass $0$ whose energy momentum vectors satisfy (7.11), so have
the form $(E, \pm E)$. These correspond to particles which move along light rays
$x = \pm t$. The law of conservation of energy momentum says that in any collision
the total energy momentum vector is conserved.
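
Formulas (7.11) through (7.16) can be compared numerically. The following Python sketch is only an illustration in natural units. The function names and sample values are hypothetical and not part of the text.

```python
import math

def energy_momentum(m, w):
    """Return (E, p) from (7.13) and (7.14) for rest mass m > 0 and velocity |w| < 1."""
    if m <= 0 or not -1.0 < w < 1.0:
        raise ValueError("need m > 0 and |w| < 1")
    factor = 1.0 / math.sqrt(1.0 - w * w)
    return m * factor, m * w * factor

def low_velocity_approximation(m, w):
    """Return the truncated expansions (7.15) and (7.16)."""
    return m + 0.5 * m * w * w, m * w + 0.5 * m * w ** 3

# Hypothetical particle of rest mass 2 moving with velocity 0.6.
E, p = energy_momentum(2.0, 0.6)
print(E * E - p * p)   # equals m^2 by (7.11)
print(p / E)           # equals w by (7.12)

# For a small velocity the truncated series is close to the exact values.
print(energy_momentum(2.0, 0.01), low_velocity_approximation(2.0, 0.01))
```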



7.1.9 Psychological units. 

Our description of two dimensional Minkowski geometry has been in terms of 
"natural units" where the speed of light is one. Points in our two dimensional 
space time are called events. They record when and where something happens. 
If we record the total events of a single human consciousness (say roughly 70 
years measured in seconds) and several thousand meters measured in seconds, 
we get a set of events which is enormously stretched out in one particular time 
direction compared to space direction, by a factor of something like $10^{18}$. Being
very skinny in the space direction as opposed to the time direction we tend to 
have a preferred splitting of spacetime with space and time directions picked 
out, and to measure distances in space with much smaller units, such as meters, 
than the units we use (such as seconds) to measure time. Of course, if we use a 
small unit, the corresponding numerical value of the measurement will be large;
in terms of human or "ordinary units" space distances will be greatly magnified
in comparison with time differences. This suggests that we consider variables $T$
and $X$ related to the natural units $t$ and $x$ by $T = c^{-1}t$, $X = x$ or

\[ \begin{pmatrix} T \\ X \end{pmatrix} = \begin{pmatrix} c^{-1} & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} t \\ x \end{pmatrix}. \]

The light cone $|t| = |x|$ goes over to $|X| = c|T|$ and we say that the "speed of
light is $c$ in ordinary units". Similarly, the time-like hyperbolas
$t^2 - x^2 = k > 0$ become very flattened out and are almost the vertical lines
$T = \text{const.}$, lines of "simultaneity". To find the expression for the
Lorentz transformations in ordinary units, we must conjugate the Lorentz
transformation, $L$, by the matrix
$\begin{pmatrix} c^{-1} & 0 \\ 0 & 1 \end{pmatrix}$ so

\[ M = \begin{pmatrix} c^{-1} & 0 \\ 0 & 1 \end{pmatrix} L \begin{pmatrix} c & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} \cosh\phi & c^{-1}\sinh\phi \\ c\sinh\phi & \cosh\phi \end{pmatrix} = \gamma \begin{pmatrix} 1 & c^{-1}w \\ cw & 1 \end{pmatrix} \]

where $L = L_w$. Of course $w$ is a pure number in natural units. In psychological
units we must write $w = v/c$, the ratio of a velocity (in units like meters per
second) to the speed of light. Then

\[ M = M_v = \gamma \begin{pmatrix} 1 & v/c^2 \\ v & 1 \end{pmatrix}, \qquad \gamma = \frac{1}{(1 - v^2/c^2)^{1/2}}. \tag{7.17} \]

Since we have passed to new coordinates in which

\[ \left\| \begin{pmatrix} T \\ X \end{pmatrix} \right\|^2 = c^2T^2 - X^2, \]

the corresponding metric in the dual space will have the energy component
divided by $c$. As we have used capital $E$ for energy and lower case $p$ for
momentum, we shall continue to denote the energy momentum vector in
psychological units by $(E,p)$ and we have

\[ \|(E,p)\|^2 = \frac{E^2}{c^2} - p^2. \]

We still must see how these units relate to our conventional units of mass. For
this, observe that we want the second term in (7.15) to look like kinetic energy
when $E$ is replaced by $E/c$, so we must rescale by $m \mapsto mc$. Thus we get

\[ \|(E,p)\|^2 = \frac{E^2}{c^2} - p^2 = m^2c^2. \tag{7.18} \]






So in psychological coordinates we rewrite (7.11)-(7.15) as (7.18) together with

\[ \frac{p}{E} = \frac{v}{c^2} \tag{7.19} \]

\[ E = \frac{mc^2}{(1 - v^2/c^2)^{1/2}} \tag{7.20} \]

\[ p = \frac{mv}{(1 - v^2/c^2)^{1/2}} \tag{7.21} \]

\[ E = mc^2 + \frac{1}{2}mv^2 + \cdots \tag{7.22} \]

\[ p = mv + \frac{1}{2}m\frac{v^3}{c^2} + \cdots. \tag{7.23} \]

Of course at velocity zero we get the famous Einstein formula $E = mc^2$.

7.1.10 The Galilean limit. 

In "the limit" $c \to \infty$ the transformations $M_v$ become

\[ G_v = \begin{pmatrix} 1 & 0 \\ v & 1 \end{pmatrix} \]

which preserve $T$ and send $X \mapsto X + vT$. These are known as Galilean
transformations. They satisfy the more familiar addition rule for velocities:

\[ G_{v_1} \circ G_{v_2} = G_{v_1 + v_2}. \]
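
The passage to the Galilean limit can be seen on numbers. The sketch below is a hypothetical Python illustration using plain 2x2 nested lists. It assumes the form of $M_v$ given in (7.17), acting on column vectors (T, X), and the sample speeds are made up.

```python
import math

def lorentz_ordinary_units(v, c):
    """The matrix M_v of (7.17), acting on column vectors (T, X)."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return [[gamma, gamma * v / c ** 2], [gamma * v, gamma]]

def galilean(v):
    """The Galilean transformation G_v: T -> T, X -> X + v*T."""
    return [[1.0, 0.0], [v, 1.0]]

def compose(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Galilean velocities simply add.
print(compose(galilean(2.0), galilean(3.0)), galilean(5.0))

# For an everyday speed (30 m/s, with c about 3e8 m/s) M_v is nearly G_v.
print(lorentz_ordinary_units(30.0, 3.0e8))
```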

7.2 Minkowski space. 

Since our everyday space is three dimensional, the correct space for special 
relativity is a four dimensional Lorentzian vector space. This key idea is due to 
Minkowski. In a famous lecture at Cologne in September 1908 he says 

Henceforth space by itself, and time by itself are doomed to fade 
away into mere shadows, and only a kind of union of the two will 
preserve an independent reality. 

Much of what we did in the two dimensional case goes over unchanged to
four dimensions. Of course, velocity, $w$ or $v$, becomes a vector, $\mathbf{w}$
or $\mathbf{v}$, as does momentum, $\mathbf{p}$ instead of $p$. So in any
expression a term such as $v^2$ must be replaced by $\|\mathbf{v}\|^2$, the three
dimensional norm squared, etc. With this modification the key formulas of the
preceding section go through. We will not rewrite them. The reverse triangle
inequality and so the twin effect go through unchanged.

Of course there are important differences: the light cone is really a cone,
and not two light rays, the space-like vectors form a connected set, the Lorentz
group is six dimensional instead of one dimensional. We will study the Lorentz
group in four dimensions in a later section. In this section we will concentrate
on two-particle collisions, where the relative angle between the momenta gives
an additional ingredient in four dimensions.



7.2.1 The Compton effect. 

We consider a photon (a "particle" of mass zero) impinging on a massive particle
(say an electron) at rest. After the collision the photon moves at an angle,
$\theta$, to its original path. The frequency of the light is changed as a
function of the angle: If $\lambda$ is the incoming wave length and $\lambda'$
the wave length of the scattered light then

\[ \lambda' = \lambda + \frac{h}{mc}(1 - \cos\theta), \tag{7.24} \]

where $h$ is Planck's constant and $m$ is the mass of the target particle. The
expression

\[ \frac{h}{mc} \]

is known as the Compton wave length of a particle of mass $m$.

Compton derived (7.24) from the conservation of energy momentum as follows:
We will work in natural units where $c = 1$. Assume Einstein's formula

\[ E_{\text{photon}} = h\nu \tag{7.25} \]

for the energy of the photon, where $\nu$ is the frequency, or equivalently,

\[ E_{\text{photon}} = \frac{h}{\lambda} \tag{7.26} \]

where $\lambda$ is the wave length. Work in the rest frame of the target
particle, so its energy momentum vector is $(m, 0, 0, 0)$. Take the $x$-axis to
be the direction of the incoming photon, so its energy momentum vector is
$(\frac{h}{\lambda}, \frac{h}{\lambda}, 0, 0)$. Assume that the collision is
elastic so that the outgoing photon still has mass zero and the recoiling
particle still has mass $m$. Choose the $y$-axis so that the outgoing photon and
the recoiling particle move in the $x,y$ plane. Then the outgoing photon has
energy momentum
$(\frac{h}{\lambda'}, \frac{h}{\lambda'}\cos\theta, \frac{h}{\lambda'}\sin\theta, 0)$
while the recoiling particle has energy momentum $(E, p_x, p_y, 0)$ and
conservation of energy momentum together with the assumed elasticity of the
collision yield

\[ \frac{h}{\lambda} + m = \frac{h}{\lambda'} + E \]

\[ \frac{h}{\lambda} = \frac{h}{\lambda'}\cos\theta + p_x \]

\[ 0 = \frac{h}{\lambda'}\sin\theta + p_y \]

\[ m^2 = E^2 - p_x^2 - p_y^2. \]

Substituting the second and third equations into the last gives

\[ E^2 = m^2 + \frac{h^2}{\lambda^2} + \frac{h^2}{\lambda'^2} - 2\frac{h^2}{\lambda\lambda'}\cos\theta \]

while the first equation yields

\[ E^2 = m^2 + \frac{h^2}{\lambda^2} + \frac{h^2}{\lambda'^2} + 2\left[ \frac{mh}{\lambda} - \frac{mh}{\lambda'} - \frac{h^2}{\lambda\lambda'} \right]. \]

Comparing these two equations gives Compton's formula, (7.24).

Notice that Compton's formula makes three startling predictions: that the 
shift in wavelength is independent of the wavelength of the incoming radiation, 
the explicit nature of the dependence of this shift on the scattering angle, and 
an experimental determination of h/mc, in particular, if h and c are known, of 
the mass, m, of the scattering particle. These were the results of Compton's 
experiment. 
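
The derivation can be verified numerically by running it backwards: take $\lambda'$ from (7.24), solve the first three conservation equations for the recoil $(E, p_x, p_y)$, and check that the recoiling particle still has mass $m$. The Python sketch below does this in natural units ($c = 1$). The function names, the default $h = 1$, and the sample values are hypothetical choices for illustration.

```python
import math

def scattered_wavelength(lam, theta, m, h=1.0):
    """Compton's formula (7.24) with c = 1."""
    return lam + (h / m) * (1.0 - math.cos(theta))

def recoil_energy_momentum(lam, theta, m, h=1.0):
    """Solve the energy and momentum conservation equations for the recoil."""
    lam_out = scattered_wavelength(lam, theta, m, h)
    E = h / lam + m - h / lam_out
    px = h / lam - (h / lam_out) * math.cos(theta)
    py = -(h / lam_out) * math.sin(theta)
    return E, px, py

# Hypothetical values: incoming wave length 0.5, target mass 1, angle 1.2 rad.
E, px, py = recoil_energy_momentum(0.5, 1.2, 1.0)
print(E * E - px * px - py * py)  # should reproduce m^2 up to rounding
```

Note that the shift `scattered_wavelength(lam, theta, m) - lam` does not depend on `lam`, which is the first of the predictions mentioned above.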

It is worth recalling the historical importance of Compton's experiment 
(1923). At the end of the nineteenth century, statistical mechanics, which 
had been enormously successful in explaining many aspects of thermodynamics, 
yielded wrong, and even non-sensical, predictions when it came to the study of 
the electromagnetic radiation emitted by a hot body - the study of "blackbody 
radiation". In 1900 Planck showed that the paradoxes could be resolved and
an excellent fit to the experimental data achieved if one assumed that the
electromagnetic radiation is emitted in packets of energy given by (7.25) where
$h$ is a constant, now called Planck's constant, with value

\[ h = 6.626 \times 10^{-27}\ \text{erg s}. \]

For Planck, this quantization of the energy of radiation was a property of the 
emission process in blackbody radiation. In 1905 Einstein proposed the radical 
view that (7.25) was a property of the electromagnetic field itself, and not of any 
particular emission process. Light, according to Einstein, is quantized according 
to (7.25). He used this to explain the photoelectric effect: When light strikes 
a metallic surface, electrons are emitted. According to Einstein, an incoming
light quantum of energy $h\nu$ strikes an electron in the metal, giving up all its
energy to the electron, which then uses up a certain amount of energy, $w$, to
escape from the surface. The electron may also use up some energy to reach the
surface. In any event, the escaping electron has energy

\[ E \le h\nu - w \]

where w is an empirical property of the material. The startling consequence 
here is that the maximum energy of the emitted electron depends only on the 
frequency of the radiation, but not on the intensity of the light beam. Increas- 
ing the intensity will increase the number of electrons emitted, but not their 
maximum energy. Einstein's theory was rejected by the entire physics community.
With the temporary exception of Stark (who later became a vicious Nazi and
attacked the theory of relativity as a Jewish plot) physicists could not
accept the idea of a corpuscular nature to light, for this seemed to contradict
the well established interference phenomena which implied a wave theory, and
also contradicted Maxwell's equations, which were the cornerstone of all of
theoretical physics. For a typical view, let us quote at length from Millikan
(of oil drop fame) whose experimental result gave the best confirmation of
Einstein's predictions for the photoelectric effect. In his Nobel lecture (1924)
he writes

After ten years of testing and changing and learning and sometimes 
blundering, all efforts being directed from the first toward the accu- 
rate experimental measurement of the energies of emission of photo- 
electrons, now as a function of temperature, now of wavelength, now
of material (contact e.m.f. relations), this work resulted, contrary to
my own expectation, in the first direct experimental proof in 1914 of 
the exact validity, within narrow limits of experimental error, of the 
Einstein equation, and the first direct photoelectric determination 
of Planck's h. 

But despite Millikan's own experimental verification of Einstein's formula for the 
photoelectric effect, he did not regard this as confirmation of Einstein's theory 
of quantized radiation. On the contrary, in his paper, "A direct Photoelectric 
Determination of Planck's h" Phy. Rev. 7 (1916)355-388 where he presents his 
experimental results he writes: 

... the semi-corpuscular theory by which Einstein arrived at his
equation seems at present to be wholly untenable.... [Einstein's] bold,
not to say reckless [hypothesis] seems a violation of the very
conception of electromagnetic disturbance... [it] flies in the face of the
thoroughly established facts of interference.... Despite... the
apparently complete success of the Einstein equation, the physical theory
of which it was designed to be the symbolic expression is found so
untenable that Einstein himself, I believe, no longer holds to it, and
we are in the position of having built a perfect structure and then
knocked out entirely the underpinning without causing the building
to fall. It stands complete and apparently well tested, but without
any visible means of support. These supports must obviously exist,
and the most fascinating problem of modern physics is to find them.
Experiment has outrun theory, or, better, guided by an erroneous
theory, it has discovered relationships which seem to be of the
greatest interest and importance, but the reasons for them are as yet not
at all understood.

Of course, Millikan was mistaken when he wrote that Einstein himself had 
abandoned his own theory. In fact, Einstein extended his theory in 1916 to 
include the quantization of the momentum of the photon. But for Millikan, as 
for most physicists, Einstein's hypothesis of the light quantum was clearly "an 
erroneous theory" . 

By the way, it is amusing to compare Millikan's actual state of mind in 
1916 (which was the accepted view of the entire physics community outside of 
Einstein) with his fallacious account of it in his autobiography (1950) pp. 100- 
101, where he writes about his experimental verification of Einstein's equation 
for the photoelectric effect: 
This seemed to me, as it did to many others, a matter of very great
importance, for it rendered what I will call Planck's 1912 explosive or
trigger approach to the problem of quanta completely untenable and
proved simply and irrefutedly, I thought, that the emitted electron
that escapes with the energy $h\nu$ gets that energy by the direct transfer
of $h\nu$ units of energy from the light to the electron and hence scarcely
permits of any interpretation than that which Einstein had originally
suggested, namely that of the semi-corpuscular or photon theory of
light itself.

Self-delusion or outright mendacity? In general I have found that one can not 
trust the accounts given by scientists of their own thought processes, especially 
those given many years after the events. 

In any event, it was only with the Compton experiment, that Einstein's 
formula, (7.25) was accepted as a property of light itself. 

For a detailed history see the book The Compton Effect by Roger H. Stuewer, 
Science History Publications, New York 1975, from which I have taken the above 
quotes. 

7.2.2 Natural Units. 

In this section I will make the paradoxical argument that Planck's constant and 
(7.25) have a purely classical interpretation: Like c, Planck's constant, h, may 
be viewed as a conversion factor from natural units to conventional units. 

For this I will again briefly call on a higher theory, symplectic geometry. 
In that theory, conserved quantities are associated to continuous symmetries. 
More precisely, if $G$ is a Lie group of symmetries with Lie algebra
$\mathfrak{g}$, the moment map, $\Phi$, for a Hamiltonian action takes values in
$\mathfrak{g}^*$, the dual space of the Lie algebra. A basis of $\mathfrak{g}$
determines a dual basis of $\mathfrak{g}^*$. In the case at hand, the Lie
algebra in question is the algebra of translations, and the moment map yields
the (total) energy-momentum vector. Hence if we measure translations in units
of length, then the corresponding units for energy momentum should be inverse
length. In this sense the role of Planck's constant in (7.26) is a conversion
factor from natural units of inverse length to the conventional units of energy.
So we interpret $h = 6.626 \times 10^{-27}$ erg s as the conversion factor from
the natural units of inverse seconds to the conventional units of ergs.

In order to emphasize this point, let us engage in some historical science 
fiction: Suppose that mechanics had developed before the invention of clocks. 
So we could observe trajectories of particles, their collisions and deflections, 
but not their velocities. For instance, we might be able to observe tracks in a 
bubble chamber or on a photographic plate. If our theory is invariant under 
the group of translations in space, then linear momentum would be an invariant 
of the particle; if our theory is invariant under the group of three dimensional 
Euclidean motions, the symplectic geometry tells us that $\|\mathbf{p}\|$, the
length of the linear momentum, is an invariant of the particle. In the absence
of a notion of velocity, we might not be able to distinguish between a heavy
particle moving slowly or a light particle moving fast. Without some way of
relating momentum to length, we would introduce "independent units" of momentum,
perhaps by combining particles in various ways and by performing collision
experiments. But symplectic geometry tells us that the "natural" units of
momentum should be inverse length, and that de Broglie's equation

\[ \|\mathbf{p}\| = \frac{h}{\lambda} \tag{7.27} \]

gives Planck's constant as a conversion factor from natural units to conventional
units. In fact, the crucial experiment was the photo-electric effect, carried out
in detail by Millikan.

The above discussion does not detract, even in retrospect, from the radical
character of Einstein's 1905 proposal. Even in terms of "natural units" the
startling proposal is that it is a single particle, the photon, which interacts
with a single particle, the electron, to produce the photoelectric effect. It is
this "corpuscular" picture which was so difficult to accept. Furthermore, it is a
bold hypothesis to identify the "natural units" of the photon momentum with the
inverse wave length.

For reasons of convenience physicists frequently prefer to use
$\hbar := h/2\pi$ as the conversion factor.

One way of choosing natural units is to pick some particular particle and
use its mass as the mass unit. Suppose we pick the proton. Then $m_P$, the mass
of the proton, is the basic unit of mass, and $\ell_P$, the Compton wave length
of the proton, is the basic unit of length. Also $t_P$, the time it takes for
light to travel the distance of one Compton wave length, is the basic unit of
time. The conversion factors to the cgs system (using $\hbar$) are:

\[ m_P = 1.672 \times 10^{-24}\ \text{g} \]

\[ \ell_P = 0.211 \times 10^{-13}\ \text{cm} \]

\[ t_P = 0.07 \times 10^{-23}\ \text{sec}. \]

We will oscillate between using natural units and familiar units. Usually, we
will derive the formulas we want in natural units, where the computations are
cleaner, and then state the results in conventional units which are used in the
laboratory.

7.2.3 Two-particle invariants.

Suppose that A and B are particles with energy momentum vectors $p_A$ and
$p_B$. In any particular frame they have the expression
$p_A = (E_A, \mathbf{p}_A)$ and $p_B = (E_B, \mathbf{p}_B)$. We have the three
invariants

\[ p_A^2 = m_A^2 = E_A^2 - (\mathbf{p}_A, \mathbf{p}_A) \]

\[ p_B^2 = m_B^2 = E_B^2 - (\mathbf{p}_B, \mathbf{p}_B) \]

\[ p_A \cdot p_B = E_A E_B - (\mathbf{p}_A, \mathbf{p}_B). \]
For the purpose of this section our notation is that $(\ ,\ )$ refers to the three
dimensional scalar product, a symbol such as $p_A$ denotes the energy momentum
(four) vector of particle A, $p_A \cdot p_B$ denotes the four dimensional scalar
product and we write $p_A^2$ for $p_A \cdot p_A$. These are all standard
notations. The left hand sides are all invariants in the sense that their
computation does not depend on the choice of frame. Many computations become
transparent by choosing a frame in which some of the expressions on the right
take on a particularly simple form. It is intuitively obvious (and also a
theorem) that these are the only invariants - that any other invariant
expression involving the two momentum vectors must be a function of these three.
For example,

\[ (p_A + p_B)^2 = p_A^2 + 2 p_A \cdot p_B + p_B^2 \]

and

\[ (p_A - p_B)^2 = p_A^2 - 2 p_A \cdot p_B + p_B^2. \]

Here are some examples:



Decay at rest. 

Particle A, at rest, decays into particles B and C, symbolically $A \to B + C$.
Find the energies, and the magnitudes of the momenta and velocities of the
outgoing particles in the rest frame of particle A. Conservation of energy
momentum gives

\[ p_A = p_B + p_C \quad\text{or}\quad p_C = p_A - p_B, \]

so

\[ p_C^2 = p_A^2 + p_B^2 - 2 p_A \cdot p_B \quad\text{or}\quad m_C^2 = m_A^2 + m_B^2 - 2 m_A E_B \]

since $p_A = (m_A, 0, 0, 0)$. Solving gives

\[ E_B = \frac{m_A^2 + m_B^2 - m_C^2}{2 m_A}. \]

Interchanging B and C gives the formula for $E_C$. We have
$\mathbf{p}_B = -\mathbf{p}_C$ and $E_B^2 - \|\mathbf{p}_B\|^2 = m_B^2$.
Substituting the above expression for $E_B$ gives

\[ \|\mathbf{p}_B\|^2 = \frac{1}{4 m_A^2}\left( m_A^4 + m_B^4 + m_C^4 - 2 m_A^2 m_C^2 - 2 m_B^2 m_C^2 + 2 m_A^2 m_B^2 - 4 m_A^2 m_B^2 \right) \]

so

\[ \|\mathbf{p}_B\| = \|\mathbf{p}_C\| = \frac{\sqrt{\lambda(m_A^2, m_B^2, m_C^2)}}{2 m_A} \tag{7.28} \]

where $\lambda$ is the "triangle function"

\[ \lambda(x, y, z) := x^2 + y^2 + z^2 - 2xy - 2xz - 2yz. \tag{7.29} \]

If we now redo the computation in ordinary units, keeping track of dividing by
$c^2$ in the energy part of the scalar product and multiplying all $m$'s by $c$,
we get

\[ E_B = \frac{m_A^2 + m_B^2 - m_C^2}{2 m_A}\, c^2 \tag{7.30} \]

\[ \|\mathbf{p}_B\| = \|\mathbf{p}_C\| = \frac{\sqrt{\lambda(m_A^2, m_B^2, m_C^2)}}{2 m_A}\, c. \tag{7.31} \]

Equation (7.12) becomes

\[ \mathbf{v} = \frac{c^2}{E}\, \mathbf{p}. \tag{7.32} \]

Taking the magnitudes and using the above formulas for the energies and
magnitudes of the momenta gives the magnitudes of the velocities of the outgoing
particles. Since our energies are all non-negative, (7.30) shows that decay from
rest can not occur unless $m_A \ge m_B + m_C$. (Similarly the expression for
$\|\mathbf{p}_B\|$ would become imaginary if $m_A < m_B + m_C$.)
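
The decay formulas are short enough to compute directly. The following Python sketch works in natural units ($c = 1$). The function names and the sample masses are hypothetical, chosen only so that the numbers come out simply.

```python
import math

def triangle(x, y, z):
    """The triangle function (7.29)."""
    return x * x + y * y + z * z - 2 * x * y - 2 * x * z - 2 * y * z

def decay_at_rest(mA, mB, mC):
    """Energies of B and C and the common momentum magnitude for A -> B + C,
    in the rest frame of A (natural units, c = 1)."""
    if mA < mB + mC:
        raise ValueError("decay from rest needs mA >= mB + mC")
    EB = (mA ** 2 + mB ** 2 - mC ** 2) / (2 * mA)
    EC = (mA ** 2 + mC ** 2 - mB ** 2) / (2 * mA)
    # max(..., 0.0) guards against tiny negative rounding errors at threshold.
    p = math.sqrt(max(triangle(mA ** 2, mB ** 2, mC ** 2), 0.0)) / (2 * mA)
    return EB, EC, p

# Hypothetical masses: a particle of mass 5 decaying to mass 3 plus a massless one.
EB, EC, p = decay_at_rest(5.0, 3.0, 0.0)
print(EB, EC, p)
print(EB + EC)              # total energy equals mA
print(EB ** 2 - p ** 2)     # equals mB^2
```

The velocities then follow from (7.12) as `p / EB` and `p / EC`; for the massless particle this ratio is 1, as expected for motion along a light ray.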



Energy, momenta, and velocities in the center of momentum system. 

Suppose we have particles B and C with energy momenta $p_B$ and $p_C$ in some
coordinate system, and we want to find the expression for their energy, momenta,
and velocity in their center of momentum system.

To find these values for particles B and C we can apply the following trick.
Consider an imaginary particle, A, whose energy momentum vector is $p_B + p_C$.
Its mass squared is given by

\[ m_A^2 = m_B^2 + m_C^2 + 2 p_B \cdot p_C. \]

Plugging this value for $m_A$ into the formulas of the previous subsection gives
the desired answers.



Colliding beam versus stationary target. 

A beam of particles of type A smashes into a stationary target of particles of
type A, in the hope of producing the reaction

\[ A + A \to A + A + A + \bar{A} \]

where $\bar{A}$ denotes the antiparticle of A. (All we have to know about the
antiparticle is that it has the same mass as A.) What kinetic energy is needed
to produce this reaction?

Let $m$ denote the mass of A. In the laboratory frame, the stationary, target
particle has energy momentum vector $(m, 0, 0, 0)$ while the incoming particle
has energy momentum vector $(E, \mathbf{p})$. Thus before the collision, we have
$p_{\text{tot}} = (E + m, \mathbf{p})$ and hence

\[ p_{\text{tot}}^2 = (E + m)^2 - \|\mathbf{p}\|^2. \]

In the center of momentum frame, the threshold for production of the four
particles will be when there is no energy left over for motion, so they are all
four at rest. The total energy momentum vector, call it $q$, for the four
particles will then be $q = (4m, 0, 0, 0)$ in the center of momentum system,
hence $q^2 = (4m)^2$ and so $p_{\text{tot}}^2 = q^2$ implies

\[ (E + m)^2 - \|\mathbf{p}\|^2 = (4m)^2. \]

But $E^2 - \|\mathbf{p}\|^2 = m^2$ so we get

\[ 2mE + m^2 = 15m^2 \]

or

\[ E = 7m. \]

In ordinary units we would write this as

\[ E = 7mc^2. \]

Now $E = mc^2 + \text{kinetic energy} + \cdots$ so approximately $6mc^2$ of
kinetic energy must be supplied.

On the other hand, if we shoot two beams of particles of type A against one
another, then for the collision, the laboratory frame and the center of momentum
frame coincide, the incoming total energy momentum vector is $(2E, 0, 0, 0)$,
and our conservation equation becomes $(2E)^2 = (4m)^2$, so $E = 2m$. We thus
must supply kinetic energy equal to about $m$ to each particle, or a total
energy of about $2mc^2$ in ordinary units. Comparing the two experiments we see
that the colliding beam experiment is more energy efficient (by a factor of
three). Today virtually all new machines for collision experiments are colliders
for this reason.
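
The same bookkeeping works for any number of final particles of mass $m$. The Python sketch below is a hypothetical generalization in natural units ($c = 1$). The function names and the parameter `n_final` are assumptions introduced here for illustration, and the case `n_final = 4` reproduces the numbers in the text.

```python
def fixed_target_threshold_energy(m, n_final=4):
    """Total energy E of the beam particle at threshold, from
    (E + m)^2 - |p|^2 = (n_final * m)^2 together with E^2 - |p|^2 = m^2."""
    total_mass = n_final * m
    return (total_mass ** 2 - 2 * m ** 2) / (2 * m)

def collider_threshold_energy(m, n_final=4):
    """Energy E of each beam particle at threshold, from (2E)^2 = (n_final * m)^2."""
    return n_final * m / 2

m = 1.0  # hypothetical mass unit
kinetic_fixed = fixed_target_threshold_energy(m) - m
kinetic_collider = 2 * (collider_threshold_energy(m) - m)
print(kinetic_fixed, kinetic_collider, kinetic_fixed / kinetic_collider)
```

The ratio printed last is the factor of three by which the colliding beam arrangement is more economical in this example.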

7.2.4 Mandelstam variables.

We consider a two body scattering event with a two body outcome, so

\[ A + B \to C + D. \]

Both the incoming and the outgoing particles can exist in various states, and
it is the role of any quantum mechanical theory to yield a probability
amplitude for a pair of incoming states to scatter into a pair of outgoing
states. In general, the states are characterized by various "internal"
parameters such as spin, isospin etc., in addition to their momentum. However we
shall consider the situation where the only important parameters describing the
states are their momenta. So the quantum mechanical theory is to provide the
transition amplitude, $T(p_A, p_B, p_C, p_D)$, a complex number such that
$|T|^2$ gives the relative probability of two entering states with energy
momentum vectors $p_A$ and $p_B$ to scatter to the outgoing states with energy
momenta $p_C$ and $p_D$. This looks like a function of four vectors, i.e. of
sixteen variables, but Lorentz invariance and conservation of energy momentum
imply that there are only two free variables. Indeed, Lorentz invariance implies
that $T$ should be a function of the various scalar products $p_A^2$,
$p_A \cdot p_B$ etc., of which there are ten in all. Of these, $p_A^2 = m_A^2$,
$p_B^2 = m_B^2$, $p_C^2 = m_C^2$, and $p_D^2 = m_D^2$ are parameters of the
particles, and hence do not vary, leaving the six products of the form
$p_A \cdot p_B$ etc. as variables. But these are constrained by conservation of
energy momentum, $p_A + p_B = p_C + p_D$, which provides four equations, leaving
only two of the products independent. It turns out, for reasons of "crossing
symmetry", that it is convenient to use two of the three Mandelstam variables
defined by

\[ s := c^{-2}(p_A + p_B)^2 \tag{7.33} \]

\[ t := c^{-2}(p_A - p_C)^2 \tag{7.34} \]

\[ u := c^{-2}(p_A - p_D)^2 \tag{7.35} \]

as independent variables. Now conservation of energy momentum implies that
$p_A \cdot (p_C + p_D) = p_A \cdot (p_A + p_B)$. Hence

\[ s + t + u = m_A^2 + m_B^2 + m_C^2 + m_D^2 \tag{7.36} \]

gives the relation between the three Mandelstam variables. Although the
Mandelstam variables are important for theoretical work, the parameters that are
measured in the laboratory are incoming and outgoing energies and scattering
angle. It therefore becomes useful to express these laboratory parameters in
terms of the Mandelstam variables.
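
Relation (7.36) can be checked on a concrete event. The Python sketch below works in natural units ($c = 1$). The function names are hypothetical, and for simplicity the event is an elastic one ($m_C = m_A$, $m_D = m_B$) set up in the center of momentum frame, so that conservation of energy momentum holds by construction.

```python
import math

def minkowski_dot(p, q):
    """Four dimensional scalar product E_p*E_q - (p_vec, q_vec), with c = 1."""
    return p[0] * q[0] - sum(a * b for a, b in zip(p[1:], q[1:]))

def on_shell(m, px, py, pz):
    """Energy momentum four vector of a particle of mass m with the given momentum."""
    return (math.sqrt(m * m + px * px + py * py + pz * pz), px, py, pz)

def mandelstam(pA, pB, pC, pD):
    """Return (s, t, u) as in (7.33)-(7.35) with c = 1."""
    def square(p):
        return minkowski_dot(p, p)
    s = square(tuple(a + b for a, b in zip(pA, pB)))
    t = square(tuple(a - c for a, c in zip(pA, pC)))
    u = square(tuple(a - d for a, d in zip(pA, pD)))
    return s, t, u

def elastic_event(mA, mB, k, theta):
    """Elastic scattering through angle theta in the center of momentum frame,
    where each particle has momentum of magnitude k."""
    pA = on_shell(mA, k, 0.0, 0.0)
    pB = on_shell(mB, -k, 0.0, 0.0)
    pC = on_shell(mA, k * math.cos(theta), k * math.sin(theta), 0.0)
    pD = on_shell(mB, -k * math.cos(theta), -k * math.sin(theta), 0.0)
    return pA, pB, pC, pD

# Hypothetical values: masses 1 and 2, momentum 1.5, scattering angle 0.7 rad.
s, t, u = mandelstam(*elastic_event(1.0, 2.0, 1.5, 0.7))
print(s + t + u)  # compare with 2 * (1**2 + 2**2)
```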



Energies in terms of s. 

By the definition of $s$, we see that the total energy in the center of momentum
system is given by

\[ E_A^{CM} + E_B^{CM} = E_C^{CM} + E_D^{CM} = \sqrt{s}\, c^2. \tag{7.37} \]

To find the energy of A in the center of momentum system, we again employ
the trick of thinking of a fictitious particle of energy momentum $p_C + p_D$
(hence at rest in the CM system and with mass $\sqrt{s}$) decaying into
particles A and B with energy momenta $p_A$ and $p_B$ and apply (7.30) to obtain

\[ E_A^{CM} = \frac{(s + m_A^2 - m_B^2)c^2}{2\sqrt{s}}. \tag{7.38} \]

To find the laboratory energy of particle A (where particle B is at rest) we go
through an argument similar to that used in deriving (7.30). As usual we will
do the derivation in a system of units where $c = 1$. We have
$p_B = (m_B, 0, 0, 0)$ and $p_A = p_C + p_D - p_B$, so

\[ \begin{aligned} m_A^2 &= (p_C + p_D)^2 + m_B^2 - 2 p_B \cdot (p_C + p_D) \\ &= s + m_B^2 - 2 p_B \cdot (p_A + p_B) \\ &= s - m_B^2 - 2 m_B E_A. \end{aligned} \]

Solving gives

\[ E_A = \frac{s - m_A^2 - m_B^2}{2 m_B}. \]

Reverting to general coordinates gives

\[ E_A^{Lab} = \frac{(s - m_A^2 - m_B^2)c^2}{2 m_B}. \tag{7.39} \]

Angles in terms of Mandelstam variables. 

We will study the special case where $m_C = m_A$ and $m_D = m_B$, for example
when the outgoing and incoming particles are the same. The variable $t$ is called
the momentum transfer. Conservation of energy momentum says that

$$ q := p_A - p_C = p_D - p_B $$

and the definition of $t$ says that

$$ t = q^2. $$

(We work in units where $c = 1$.) Squaring both sides of $p_D = p_B + q$ and using
the assumption that $m_D = m_B$ gives

$$ t = -2p_B \cdot q. $$

In the laboratory frame where $B$ is at rest, so $p_B = (m_B, 0, 0, 0)$, this becomes

$$ t = -2m_B(E_A - E_C). $$

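Suppose that $A$ is a very light particle, practically of mass zero, so that
$p_A = (\|\mathbf{k}_A\|, \mathbf{k}_A)$ and $p_C = (\|\mathbf{k}_C\|, \mathbf{k}_C)$. Then

$$ \begin{aligned}
t &= q^2 \\
&= (\|\mathbf{k}_A\| - \|\mathbf{k}_C\|)^2 - \|\mathbf{k}_A - \mathbf{k}_C\|^2 \\
&= -2\|\mathbf{k}_A\|\|\mathbf{k}_C\|(1 - \cos\theta) \\
&= -4\|\mathbf{k}_A\|\|\mathbf{k}_C\|\sin^2(\theta/2),
\end{aligned} $$

where $\theta$, the scattering angle, is the angle between $\mathbf{k}_A$ and $\mathbf{k}_C$. Substituting
$t = -2m_B(\|\mathbf{k}_A\| - \|\mathbf{k}_C\|)$ into the above expression gives

$$ 2m_B(\|\mathbf{k}_A\| - \|\mathbf{k}_C\|) = 4\|\mathbf{k}_A\|\|\mathbf{k}_C\|\sin^2(\theta/2) $$

or

$$ \frac{\|\mathbf{k}_A\|}{\|\mathbf{k}_C\|} = 1 + \frac{2\|\mathbf{k}_A\|}{m_B}\sin^2(\theta/2). $$

If we assume that $\|\mathbf{k}_A\|$ is small in comparison to $m_B$ then

$$ \|\mathbf{k}_A\| \approx \|\mathbf{k}_C\| $$

and we get

$$ t = -4\|\mathbf{k}_A\|^2\sin^2(\theta/2). \tag{7.40} $$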

This formula is for a light particle of moderate energy scattering off a massive 
particle. 
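
The exact relation between $\|\mathbf{k}_A\|$ and $\|\mathbf{k}_C\|$ derived above (before the approximation $\|\mathbf{k}_A\| \approx \|\mathbf{k}_C\|$) can be tested against conservation of energy momentum. The sketch below assumes $c = 1$, a massless particle $A$, and a target $B$ at rest; the function names and the sample numbers are hypothetical. It computes $\|\mathbf{k}_C\|$ from the ratio formula and then checks that the recoil vector $p_D = p_A + p_B - p_C$ has squared mass $m_B^2$.

```python
import math

def outgoing_momentum(k_in, m_target, theta):
    """||k_C|| from ||k_A|| / ||k_C|| = 1 + (2 ||k_A|| / m_B) sin^2(theta/2)."""
    return k_in / (1.0 + 2.0*k_in/m_target*math.sin(theta/2.0)**2)

def recoil_mass_squared(k_in, m_target, theta):
    """Minkowski square of p_D = p_A + p_B - p_C in the rest frame of B."""
    k_out = outgoing_momentum(k_in, m_target, theta)
    energy = k_in + m_target - k_out
    px = -k_out*math.sin(theta)            # A comes in along the z axis
    pz = k_in - k_out*math.cos(theta)
    return energy**2 - px**2 - pz**2

def momentum_transfer(k_in, m_target, theta):
    """t = -4 ||k_A|| ||k_C|| sin^2(theta/2), without the small-energy approximation."""
    k_out = outgoing_momentum(k_in, m_target, theta)
    return -4.0*k_in*k_out*math.sin(theta/2.0)**2

k_A, m_B, angle = 0.3, 1.0, 1.1            # illustrative values
k_C = outgoing_momentum(k_A, m_B, angle)
```

For small `k_A / m_B` the value of `k_C` approaches `k_A`, which is the regime in which (7.40) applies.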
The general expression is a bit more messy, but not much. We wish to find
the angle, $\theta$, between the incoming momentum $\mathbf{p}_A$ and the outgoing momentum
$\mathbf{p}_C$ in the rest frame of $B$ in terms of the Mandelstam variables and the masses.
In this frame we have

$$ p_A \cdot p_C = E_A E_C - \|\mathbf{p}_A\|\|\mathbf{p}_C\|\cos\theta. $$

So we will proceed in two steps: first to express $E_A$, $E_C$, $\|\mathbf{p}_A\|$, $\|\mathbf{p}_C\|$ in terms
of the four dimensional scalar products (and the masses) and then to express
the scalar products in terms of the Mandelstam variables. We have



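$$ \begin{aligned}
E_A &= \frac{p_A \cdot p_B}{m_B}, \\
E_C &= \frac{p_C \cdot p_B}{m_B}, \\
\|\mathbf{p}_A\|^2 &= E_A^2 - m_A^2, \\
\|\mathbf{p}_C\|^2 &= E_C^2 - m_C^2
\end{aligned} $$

so

$$ \cos\theta = \frac{E_A E_C - p_A \cdot p_C}{\sqrt{(E_A^2 - m_A^2)(E_C^2 - m_C^2)}}. $$

If we denote the common value of $m_A$ and $m_C$ by $m$ we have

$$ \cos\theta = \frac{(p_A \cdot p_B)(p_C \cdot p_B) - m_B^2\, p_A \cdot p_C}{\sqrt{\left[(p_A \cdot p_B)^2 - m_B^2 m^2\right]\left[(p_C \cdot p_B)^2 - m_B^2 m^2\right]}}. \tag{7.41} $$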

To complete the program observe that it follows from the definitions that

$$ \begin{aligned}
2p_A \cdot p_B &= s - m_A^2 - m_B^2, \\
2p_A \cdot p_C &= m_A^2 + m_C^2 - t \quad\text{and} \\
2p_C \cdot p_B &= m_A^2 + m_B^2 - u.
\end{aligned} $$

Substituting these values into (7.41) gives us our desired expression.
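
Formula (7.41) lends itself to a numerical sanity check. The sketch below assumes $c = 1$ and an elastic event ($m_C = m_A = m$, $m_D = m_B$); all function names and numerical values are illustrative. It builds the event in the center of momentum frame, boosts it to the rest frame of $B$, measures the angle between the spatial momenta of $A$ and $C$ there, and compares the result with the right hand side of (7.41) evaluated from invariant scalar products.

```python
import math

def mdot(p, q):
    """Minkowski scalar product, signature (+,-,-,-), c = 1."""
    return p[0]*q[0] - p[1]*q[1] - p[2]*q[2] - p[3]*q[3]

def boost_z(p, beta):
    """Components of p in a frame moving with velocity beta along the z axis."""
    gamma = 1.0 / math.sqrt(1.0 - beta*beta)
    e, px, py, pz = p
    return (gamma*(e - beta*pz), px, py, gamma*(pz - beta*e))

def cos_theta_invariant(pA, pB, pC):
    """Right hand side of (7.41); assumes m_A = m_C."""
    mB2 = mdot(pB, pB)
    m2 = mdot(pA, pA)
    ab, cb, ac = mdot(pA, pB), mdot(pC, pB), mdot(pA, pC)
    return (ab*cb - mB2*ac) / math.sqrt((ab*ab - mB2*m2) * (cb*cb - mB2*m2))

def cos_theta_direct(pA, pC):
    """Cosine of the angle between the spatial parts of pA and pC."""
    dot = pA[1]*pC[1] + pA[2]*pC[2] + pA[3]*pC[3]
    norm = lambda p: math.sqrt(p[1]**2 + p[2]**2 + p[3]**2)
    return dot / (norm(pA)*norm(pC))

# Elastic event in the CM frame (illustrative numbers).
m, mB, k, theta_cm = 0.4, 1.0, 0.6, 1.0
pA = (math.hypot(m, k), 0.0, 0.0, k)
pB = (math.hypot(mB, k), 0.0, 0.0, -k)
pC = (math.hypot(m, k), k*math.sin(theta_cm), 0.0, k*math.cos(theta_cm))

# Boost with the velocity of B, so that B is at rest in the new frame.
beta = pB[3] / pB[0]
pA_lab, pB_lab, pC_lab = (boost_z(p, beta) for p in (pA, pB, pC))

cos_lab = cos_theta_direct(pA_lab, pC_lab)
cos_inv = cos_theta_invariant(pA, pB, pC)
```

Since the scalar products are Lorentz invariant, `cos_theta_invariant` may be evaluated on the CM-frame vectors even though the angle it returns is the one seen in the rest frame of $B$.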

By the way, equation (7.41) has a nice interpretation in terms of the scalar
product induced on the space of exterior two vectors. We have

$$ \begin{aligned}
(p_A \wedge p_B) \cdot (p_C \wedge p_B) &= (p_A \cdot p_C)(p_B \cdot p_B) - (p_A \cdot p_B)(p_C \cdot p_B) \\
&= -\left[(p_A \cdot p_B)(p_C \cdot p_B) - m_B^2\, p_A \cdot p_C\right]
\end{aligned} $$

while

$$ \|p_A \wedge p_B\|^2 = m_B^2 m_A^2 - (p_A \cdot p_B)^2 \quad\text{and}\quad \|p_C \wedge p_B\|^2 = m_B^2 m_C^2 - (p_C \cdot p_B)^2. $$

(The two last expressions are negative, since the two plane spanned by two
timelike vectors has signature $+-$.) We can thus write (7.41) as

$$ \cos\theta = -\frac{(p_A \wedge p_B) \cdot (p_C \wedge p_B)}{\sqrt{\|p_A \wedge p_B\|^2\,\|p_C \wedge p_B\|^2}}. $$
7.3 Scattering cross-section and mutual flux.

Let us go back to the expression $\|p_A \wedge p_B\|^2 = m_B^2 m_A^2 - (p_A \cdot p_B)^2$ from the end
of the last section. In terms of a given space time splitting with unit timelike
vector $\partial_0$, we can write

$$ \begin{aligned}
p_A &= E_A\partial_0 + \mathbf{p}_A, \\
p_B &= E_B\partial_0 + \mathbf{p}_B \quad\text{so} \\
p_A \wedge p_B &= \mathbf{p}_A \wedge \mathbf{p}_B + \partial_0 \wedge (E_A\mathbf{p}_B - E_B\mathbf{p}_A)
\end{aligned} $$

and hence

$$ \|p_A \wedge p_B\|^2 = \|\mathbf{p}_A \times \mathbf{p}_B\|^2_{\mathbf{R}^3} - \|E_A\mathbf{p}_B - E_B\mathbf{p}_A\|^2_{\mathbf{R}^3} $$

where $\times$ denotes the cross product in $\mathbf{R}^3$ and the norms in the last expression
are the three dimensional norms. In a frame where the momenta are aligned,
such as the CM frame where $\mathbf{p}_A = -\mathbf{p}_B$ or the laboratory frame where $\mathbf{p}_B = 0$,
we have $\mathbf{p}_A \times \mathbf{p}_B = 0$. Recall our relativistic definition of velocity as $\mathbf{p}/E$. So
in a frame where the momenta are aligned we have

$$ \|E_A\mathbf{p}_B - E_B\mathbf{p}_A\|_{\mathbf{R}^3} = E_A E_B\|\mathbf{v}\|_{\mathbf{R}^3} $$

where

$$ \mathbf{v} = \frac{1}{E_A}\mathbf{p}_A - \frac{1}{E_B}\mathbf{p}_B $$

is the mutual velocity. So in such a frame we have

$$ -\|p_A \wedge p_B\|^2 = E_A^2 E_B^2\|\mathbf{v}\|^2_{\mathbf{R}^3}. \tag{7.43} $$

We want to apply this to the following situation which we first study in a fixed 
frame where the momenta are aligned. 

A beam of particles of type $A$ impacts on a target of particles of type $B$ and
some events of type $f$ are observed. Let $n_f$ denote the number of events of type
$f$ per unit time, so $n_f$ has dimensions (time)$^{-1}$. We assume that the target
density is $\rho_B$ with dimensions (vol)$^{-1}$ and the beam density is $\rho_A$. We assume
that the beam is well collimated and that all of its particles have approximately
the same momentum, $p_A$. The mutual flux per unit time (at time $t$) is

$$ v\int \rho_A(t, \mathbf{x})\rho_B(t, \mathbf{x})\,d^3x $$

where

$$ v := \|\mathbf{v}\|_{\mathbf{R}^3} $$

and $\mathbf{v}$ is the mutual velocity. So the mutual flux has dimensions

$$ \frac{(\text{distance})}{(\text{time})(\text{vol.})} = \frac{1}{(\text{time})(\text{area})}. $$

Thus

$$ \frac{n_f}{v\int \rho_A(t, \mathbf{x})\rho_B(t, \mathbf{x})\,d^3x} $$
has the dimensions of area. Similarly, if we integrate the numerator and denominator
with respect to time, the corresponding quotient will have the dimensions
of area. Let $N_f := \int n_f\,dt$ be the total number of events of type $f$. Then,
integrating the denominator as well,

$$ \sigma_f := \frac{N_f}{v\int \rho_A(x)\rho_B(x)\,d^4x} \tag{7.44} $$

is called the total cross-section for events of type $f$. So it has the dimensions
of area. The convenient unit is the barn (as in "he can't hit the side of a barn")
where

$$ 1\ \text{barn} = 10^{-24}\ \text{cm}^2. $$

The denominator in the expression for the total cross-section is called the
mutual flux. It has a more invariant expression as follows: In the frame where
the target particles are at rest, the "current" of the target particles (a three
form) has the expression

$$ J_B = \rho_B\,dx \wedge dy \wedge dz $$

while the current for the beam will have an expression of the form

$$ J_A = \rho_A\,dx \wedge dy \wedge dz + dt \wedge (j_{Az}\,dx \wedge dy + j_{Ay}\,dz \wedge dx + j_{Ax}\,dy \wedge dz). $$

So

$$ \rho_A\rho_B = -J_A \cdot J_B. $$

Also, in this frame, $E_A E_B = p_A \cdot p_B$. So by (7.43) we have

$$ \text{mutual flux} = -\frac{\sqrt{-\|p_A \wedge p_B\|^2}}{p_A \cdot p_B}\int J_A \cdot J_B\,d^4x. \tag{7.45} $$



It is the function of any dynamical theory in quantum mechanics to make some
predictions about the expected number of events of type $f$.
Chapter 8

Die Grundlagen der Physik. 



This was the title of Hilbert's 1915 paper. It sounds a bit audacious, but let us
try to put the ideas in a general context. We need to do a few computations in 
advance, so as not to disrupt the flow of the argument. 

8.1 Preliminaries. 

8.1.1 Densities and divergences. 

If we regard $\mathbf{R}^n$ as a differentiable manifold, the law for the change of variables
for an integral involves the absolute value of the Jacobian determinant. This
is different from the law of change of variables of a function (which is just
substitution). [But it is close to the transition law for an $n$-form, which involves
the Jacobian determinant (not its absolute value).] For this reason we cannot
expect to integrate functions on a manifold. The objects that we can integrate
are known as densities. We briefly recall two equivalent ways of defining these
objects:

• Coordinate chart description. A density $\rho$ is a rule which assigns to
each coordinate chart $(U, \alpha)$ on $M$ (where $U$ is an open subset of $M$ and
$\alpha : U \to \mathbf{R}^n$) a function $\rho_\alpha$ defined on $\alpha(U)$ subject to the following
transition law: If $(W, \beta)$ is a second chart then

$$ \rho_\alpha(v) = \rho_\beta(\beta \circ \alpha^{-1}(v)) \cdot |\det J_{\beta\circ\alpha^{-1}}|(v) \quad\text{for } v \in \alpha(U \cap W) \tag{8.1} $$

where $J_{\beta\circ\alpha^{-1}}$ denotes the Jacobian matrix of the diffeomorphism

$$ \beta \circ \alpha^{-1} : \alpha(U \cap W) \to \beta(U \cap W). $$

Of course (8.1) is just the change of variables formula for an integrand in
$\mathbf{R}^n$.

• Tangent space description. If $V$ is an $n$-dimensional vector space, let
$|\wedge V^*|$ denote the space of (real or complex valued) functions of $n$-tuplets
of vectors which satisfy

$$ \sigma(Av_1, \ldots, Av_n) = |\det A|\,\sigma(v_1, \ldots, v_n). \tag{8.2} $$

The space $|\wedge V^*|$ is clearly a one-dimensional vector space. A density $\rho$
is then a rule which assigns to each $x \in M$ an element of $|\wedge TM_x^*|$.

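The relation between these two descriptions is the following: Let $\rho$ be a density
according to the tangent space description. Thus $\rho_x \in |\wedge TM_x^*|$ for every $x \in M$.
Let $(U, \alpha)$ be a coordinate chart with coordinates $x^1, \ldots, x^n$. Then on $U$ we have
the vector fields

$$ \frac{\partial}{\partial x^1}, \ldots, \frac{\partial}{\partial x^n}. $$

We can then evaluate $\rho_x$ on the values of these vector fields at any $x \in U$, and
so define

$$ \rho_\alpha(\alpha(x)) = \rho_x\left(\left(\frac{\partial}{\partial x^1}\right)_x, \ldots, \left(\frac{\partial}{\partial x^n}\right)_x\right). $$

If $(W, \beta)$ is a second coordinate chart with coordinates $y^1, \ldots, y^n$ then on $U \cap W$
we have

$$ \frac{\partial}{\partial x^j} = \sum_i \frac{\partial y^i}{\partial x^j}\frac{\partial}{\partial y^i} $$

and

$$ J_{\beta\circ\alpha^{-1}} = \left(\frac{\partial y^i}{\partial x^j}\right) $$

so (8.1) follows from (8.2).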

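If $(U, \alpha)$ is a coordinate chart with coordinates $x^1, \ldots, x^n$ then the density
defined on $U$ by $\rho_\alpha \equiv 1$, that is by

$$ \rho_x\left(\left(\frac{\partial}{\partial x^1}\right)_x, \ldots, \left(\frac{\partial}{\partial x^n}\right)_x\right) = 1 \quad \forall\, x \in U $$

is denoted by $dx$. Every other density then has the local description $G\,dx$ on $U$
where $G$ is a function.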

If $\phi : N \to M$ is a diffeomorphism and if $\rho$ is a density on $M$, then the pull
back $\phi^*\rho$ is the density on $N$ defined by

$$ (\phi^*\rho)_z(v_1, \ldots, v_n) := \rho_{\phi(z)}(d\phi_z(v_1), \ldots, d\phi_z(v_n)), \quad z \in N,\ v_1, \ldots, v_n \in TN_z. $$

(It is easy to check that this is indeed a density, i.e. that (8.2) holds at each
$z \in N$.)

In particular, if $X$ is a vector field on $M$ generating a one parameter group

$$ t \mapsto \exp tX $$

of diffeomorphisms, we can form the Lie derivative

$$ D_X\rho := \frac{d}{dt}(\exp tX)^*\rho\,\Big|_{t=0}. $$
We will need a local description of this Lie derivative. We can derive such a
local description from Weil's formula for the Lie derivative of a differential form
by the following device: Suppose that the manifold $M$ is orientable and that
we have chosen an orientation of $M$. This means that we have chosen a system
of coordinate charts such that all the Jacobian determinants $\det J_{\beta\circ\alpha^{-1}}$ are
positive. Relative to this system of charts, we can drop the absolute value sign
in (8.1) since $\det J_{\beta\circ\alpha^{-1}} > 0$. But (8.1) without the absolute value signs is just
the transition law for an $n$-form on the $n$-dimensional manifold $M$. In other
words, once we have chosen an orientation on an orientable manifold $M$ we
can identify densities with $n$-forms. A fixed chart $(U, \alpha)$ carries the orientation
coming from $\mathbf{R}^n$ and our identification amounts to identifying the density $dx$
with the $n$-form $dx^1 \wedge \cdots \wedge dx^n$.

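If $\tau$ is an $n$-form on an $n$-dimensional manifold then Weil's formula

$$ D_X\tau = i(X)\,d\tau + d\,i(X)\tau $$

reduces to

$$ D_X\tau = d\,i(X)\tau $$

since $d\tau = 0$, as there are no non-zero $(n+1)$-forms on an $n$-dimensional manifold.
If

$$ X = X^1\frac{\partial}{\partial x^1} + \cdots + X^n\frac{\partial}{\partial x^n} $$

and

$$ \tau = G\,dx^1 \wedge \cdots \wedge dx^n $$

in terms of local coordinates then an immediate computation gives

$$ d\,i(X)\tau = \left(\sum_{i=1}^n \partial_i(GX^i)\right)dx^1 \wedge \cdots \wedge dx^n \tag{8.3} $$

where

$$ \partial_i := \frac{\partial}{\partial x^i}. $$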

It is useful to express this formula somewhat differently. It makes no sense
to talk about a numerical value of a density $\rho$ at a point $x$ since $\rho$ is not a
function. But it does make sense to say that $\rho$ does not vanish at $x$, since if
$\rho_\alpha(\alpha(x)) \neq 0$ then (8.1) implies that $\rho_\beta(\beta(x)) \neq 0$. Suppose that $\rho$ is a density
which does not vanish anywhere. Then any other density on $M$ is of the form
$f \cdot \rho$ where $f$ is a function. If $X$ is a vector field, so that $D_X\rho$ is another density,
then $D_X\rho$ is of the form $f\rho$ where $f$ is a function, called the divergence of the
vector field $X$ relative to the non-vanishing density $\rho$ and denoted by $\operatorname{div}_\rho(X)$.
In symbols,

$$ D_X\rho = (\operatorname{div}_\rho(X)) \cdot \rho. $$

We can then rephrase (8.3) as saying that

$$ \operatorname{div}_\rho(X) = \frac{1}{G}\sum_{i=1}^n \partial_i(GX^i) \tag{8.4} $$

in a local coordinate system where

$$ X = X^1\partial_1 + \cdots + X^n\partial_n $$

is the local expression for $X$ and

$$ G\,dx $$

is the local expression for $\rho$.
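
As an illustration of (8.4), here is a small worked example; the choice of density and vector field is ours, not the text's. Take the standard density $dx\,dy$ on the punctured plane, written in polar coordinates, and the unit radial vector field. The last line repeats the computation in Cartesian coordinates (where $G = 1$) as a consistency check.

```latex
\begin{aligned}
\rho &= dx\,dy = r\,dr\,d\theta, \qquad\text{so } G = r \text{ in the chart } (r,\theta),\\
X &= \partial_r \qquad (X^r = 1,\ X^\theta = 0),\\
\operatorname{div}_\rho X &= \frac{1}{r}\Bigl(\partial_r(r\cdot 1) + \partial_\theta(r\cdot 0)\Bigr) = \frac{1}{r},\\
\text{Cartesian check: } X &= \frac{x}{r}\,\partial_x + \frac{y}{r}\,\partial_y, \qquad
\partial_x\!\Bigl(\frac{x}{r}\Bigr) + \partial_y\!\Bigl(\frac{y}{r}\Bigr)
  = \frac{2}{r} - \frac{x^2+y^2}{r^3} = \frac{1}{r}.
\end{aligned}
```

Both charts give the same function $1/r$, as they must, since the divergence depends only on $X$ and $\rho$ and not on the coordinates used to compute it.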

8.1.2 Divergence of a vector field on a semi-Riemannian 
manifold. 

Suppose that $g$ is a semi-Riemann metric on an $n$-dimensional manifold, $M$.
Then $g$ determines a density, call it $\mathbf{g}$, which assigns to every $n$ tangent vectors,
$\xi_1, \ldots, \xi_n$ at a point $x$ the "volume" of the parallelepiped that they span:

$$ \mathbf{g} : \xi_1, \ldots, \xi_n \mapsto |\det(\langle\xi_i, \xi_j\rangle)|^{1/2}. \tag{8.5} $$

If we replace the $\xi_j$ by $A\xi_j$ where $A : TM_x \to TM_x$ the determinant is replaced
by

$$ \det(\langle A\xi_i, A\xi_j\rangle) = (\det A)^2\det(\langle\xi_i, \xi_j\rangle) $$

so we see that (8.2) is satisfied. So $\mathbf{g}$ is indeed a density, and since the metric is
non-singular, the density $\mathbf{g}$ does not vanish at any point.

So if $X$ is a vector field on $M$, we can consider its divergence $\operatorname{div}_{\mathbf{g}}(X)$ with
respect to $\mathbf{g}$. Since $\mathbf{g}$ will be fixed for the rest of this subsection, we may drop
the subscript $\mathbf{g}$ and simply write $\operatorname{div} X$. So

$$ (\operatorname{div} X) \cdot \mathbf{g} = D_X\mathbf{g}. \tag{8.6} $$

On the other hand, we can form the covariant differential of $X$ with respect
to the connection determined by $g$,

$$ \nabla X. $$

It assigns an element of $\operatorname{Hom}(TM_p, TM_p)$ to each $p \in M$ according to the rule

$$ \xi \mapsto \nabla_\xi X. $$

The trace of this operator is a number, assigned to each point, $p$, i.e. a function
known as the "contraction" of $\nabla X$, so

$$ C(\nabla X) := f, \qquad f(p) := \operatorname{tr}(\xi \mapsto \nabla_\xi X). $$

We wish to prove the following formula

$$ \operatorname{div} X = C(\nabla X). \tag{8.7} $$

We will prove this by computing both sides in a coordinate chart with coordinates,
say, $x^1, \ldots, x^n$. Let $dx = dx^1 dx^2 \cdots dx^n$ denote the standard density (the
one which assigns constant value one to the $\partial_1, \ldots, \partial_n$, $\partial_i := \partial/\partial x^i$). Then

$$ \mathbf{g} = G\,dx, \qquad G = |\det(\langle\partial_i, \partial_j\rangle)|^{1/2} = (\epsilon\det(\langle\partial_i, \partial_j\rangle))^{1/2} $$

where

$$ \epsilon := \operatorname{sgn}\det(\langle\partial_i, \partial_j\rangle). $$
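Recall the local formula (8.4) for the divergence:

$$ \operatorname{div} X = \frac{1}{G}\sum_i \partial_i(X^iG). $$

Write

$$ \Delta := \det(\langle\partial_i, \partial_j\rangle) $$

so

$$ \frac{1}{G}\partial_iG = (\epsilon\Delta)^{-1/2}\frac{\partial(\epsilon\Delta)^{1/2}}{\partial x^i} = \frac{1}{2}\frac{1}{\Delta}\frac{\partial\Delta}{\partial x^i} $$

independent of whether $\epsilon = 1$ or $-1$. To compute this partial derivative, let us
use the standard notation

$$ g_{ij} := \langle\partial_i, \partial_j\rangle $$

so

$$ \Delta = \sum_j g_{ij}\Delta^{ij} $$

where we have expanded the determinant along the $i$-th row and the $\Delta^{ij}$ are
the corresponding cofactors. If we think of $\Delta$ as a function of the $n^2$ variables
$g_{ij}$ then, since none of the $\Delta^{ij}$ (for a fixed $i$) involve $g_{ij}$, we conclude from the
above cofactor expansion that

$$ \frac{\partial\Delta}{\partial g_{ij}} = \Delta^{ij} \tag{8.8} $$

and hence by the chain rule that

$$ \frac{\partial\Delta}{\partial x^k} = \sum_{ij}\Delta^{ij}\frac{\partial g_{ij}}{\partial x^k}. $$

But $\Delta^{-1}(\Delta^{ij})$ is the inverse matrix of $(g_{ij})$, which is usually denoted by

$$ (g^{kl}) $$

so we have

$$ \frac{1}{\Delta}\frac{\partial\Delta}{\partial x^k} = \sum_{ij}g^{ij}\frac{\partial g_{ij}}{\partial x^k}. $$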

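Recall that

$$ \Gamma^a_{bc} = \frac{1}{2}\sum_r g^{ar}\left(\frac{\partial g_{rb}}{\partial x^c} + \frac{\partial g_{rc}}{\partial x^b} - \frac{\partial g_{bc}}{\partial x^r}\right) $$

so

$$ \sum_a \Gamma^a_{ba} = \frac{1}{2}\sum_{a,r}g^{ar}\frac{\partial g_{ar}}{\partial x^b} $$

or

$$ \sum_a \Gamma^a_{ba} = \frac{1}{2\Delta}\frac{\partial\Delta}{\partial x^b} $$

or

$$ \sum_a \Gamma^a_{ba} = \frac{1}{G}\partial_bG. $$

On the other hand, we have

$$ \nabla_{\partial_a}X = \sum_b\left(\partial_aX^b + \sum_c\Gamma^b_{ac}X^c\right)\partial_b $$

so

$$ C(\nabla X) = \sum_a \partial_aX^a + \sum_{a,c}\Gamma^a_{ac}X^c = \sum_a \partial_aX^a + \frac{1}{G}\sum_c X^c\partial_cG = \frac{1}{G}\sum_a \partial_a(GX^a), $$

proving (8.7).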

For later use let us go over one step of this proof. From (8.8) we can conclude,
as above, that

$$ \frac{\partial G}{\partial g_{ij}} = \frac{1}{2}Gg^{ij}. \tag{8.10} $$
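
The cofactor identity (8.8), and the formula for $\partial G/\partial g_{ij}$ that follows from it, can be checked by finite differences. The sketch below treats the $n^2$ entries $g_{ij}$ as independent variables, as in the text; the helper names and the sample matrix (a symmetric $3 \times 3$ matrix of indefinite signature) are illustrative assumptions.

```python
import math

def minor(m, i, j):
    """Matrix m with row i and column j removed."""
    return [row[:j] + row[j+1:] for k, row in enumerate(m) if k != i]

def det(m):
    """Determinant by cofactor expansion along the first row (fine for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1)**j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))

def cofactor(m, i, j):
    return (-1)**(i + j) * det(minor(m, i, j))

def volume_factor(m):
    """G = |det(g_ij)|^(1/2)."""
    return math.sqrt(abs(det(m)))

def partial(func, m, i, j, h=1e-6):
    """Central difference of func with respect to the single entry m[i][j]."""
    up = [row[:] for row in m]
    dn = [row[:] for row in m]
    up[i][j] += h
    dn[i][j] -= h
    return (func(up) - func(dn)) / (2.0*h)

g = [[-1.0, 0.2, 0.1],
     [0.2, 2.0, 0.3],
     [0.1, 0.3, 1.5]]
delta = det(g)
G = volume_factor(g)

# Largest deviation from d(Delta)/d(g_ij) = Delta^{ij}.
err_det = max(abs(partial(det, g, i, j) - cofactor(g, i, j))
              for i in range(3) for j in range(3))
# Largest deviation from dG/d(g_ij) = (1/2) G g^{ij}, with g^{ij} = Delta^{ij} / Delta.
err_vol = max(abs(partial(volume_factor, g, i, j) - 0.5*G*cofactor(g, i, j)/delta)
              for i in range(3) for j in range(3))
```

The determinant here is negative, so the check also illustrates the remark that the result is independent of the sign $\epsilon$.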

8.1.3 The Lie derivative of a semi-Riemann metric.

We wish to prove

$$ L_Vg = \mathcal{S}\nabla(V\!\downarrow). \tag{8.11} $$

The left hand side of this equation is the Lie derivative of the metric $g$ with
respect to the vector field $V$. It is a rule which assigns a symmetric bilinear
form to each tangent space. By definition, it is the rule which assigns to any
pair of vector fields, $X$ and $Y$, the value

$$ (L_Vg)(X, Y) = V\langle X, Y\rangle - \langle[V, X], Y\rangle - \langle X, [V, Y]\rangle. $$

The right hand side of (8.11) means the following: $V\!\downarrow$ denotes the linear
differential form whose value at any vector field $Y$ is

$$ (V\!\downarrow)(Y) := \langle V, Y\rangle. $$

In tensor calculus terminology, $\downarrow$ is the "lowering operator", and it commutes
with covariant differential. Since $\downarrow$ commutes with $\nabla$, we have

$$ \nabla(V\!\downarrow)(X, Y) = \nabla_X(V\!\downarrow)(Y) = \langle\nabla_XV, Y\rangle. $$

The symbol $\mathcal{S}$ in (8.11) denotes symmetric sum, so that the right hand side of
(8.11) when applied to $X, Y$ is

$$ \langle\nabla_XV, Y\rangle + \langle\nabla_YV, X\rangle. $$

But now (8.11) follows from the identities

$$ \begin{aligned}
L_V\langle X, Y\rangle = V\langle X, Y\rangle &= \langle\nabla_VX, Y\rangle + \langle X, \nabla_VY\rangle, \\
\nabla_VX - [V, X] &= \nabla_XV, \\
\nabla_VY - [V, Y] &= \nabla_YV.
\end{aligned} $$
8.1.4 The covariant divergence of a symmetric tensor field. 

Let $T$ be a symmetric "contravariant" tensor field (of second order), so that in
any local coordinate system $T$ has the expression

$$ T = \sum T^{ij}\partial_i\partial_j, \qquad T^{ij} = T^{ji}. $$

If $\theta$ is a linear differential form, then we can "contract" $T$ with $\theta$ to obtain a
vector field, $T \cdot \theta$: In local coordinates, if

$$ \theta = \sum a_i\,dx^i $$

then

$$ T \cdot \theta = \sum T^{ij}a_j\partial_i. $$

We can form the covariant differential, $\nabla T$, which then assigns to every linear
differential form a linear transformation of the tangent space at each point, and
then form the contraction, $C(\nabla T)$. (Since $T$ is symmetric, we don't have to
specify on "which of the upper indices" we are contracting.) We define

$$ \operatorname{div} T := C(\nabla T), $$

called the covariant divergence of $T$. It is a vector field. The purpose of this
section is to explain the geometrical significance of the condition

$$ \operatorname{div} T = 0. \tag{8.12} $$

If $S$ is a "covariant" symmetric tensor field so that

$$ S = \sum S_{ij}\,dx^i\,dx^j $$

in local coordinates, let $S \bullet T$ denote the double contraction. It is a function,
given in local coordinates by

$$ S \bullet T = \sum S_{ij}T^{ij}. $$
Thus $T$ can be regarded as a linear function on the space of all covariant symmetric
tensors of compact support by the rule

$$ S \mapsto \int_M S \bullet T\,\mathbf{g}, $$

where $\mathbf{g}$ is the volume density associated to $g$. Let $V$ be a vector field of compact
support. Then $L_Vg$ is a symmetric tensor of compact support. We claim

Proposition 10 Equation (8.12) is equivalent to

$$ \int_M (L_Vg) \bullet T\,\mathbf{g} = 0 \tag{8.13} $$

for all vector fields $V$ of compact support.

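Proof. Let $\theta := V\!\downarrow$ so $T \cdot \theta$ is a vector field of compact support, and so

$$ \int_M C(\nabla(T \cdot \theta))\,\mathbf{g} = \int_M L_{T\cdot\theta}\,\mathbf{g} = 0 $$

by the divergence theorem. (Recall our notation: the symbol $\cdot$ denotes a "single"
contraction, so that $T \cdot \theta$ is a vector field.)
On the other hand,

$$ \nabla(T \cdot \theta) = (\nabla T) \cdot \theta + T \cdot \nabla\theta. $$

Apply the contraction, $C$. The first term contributes $C((\nabla T) \cdot \theta) = (\operatorname{div} T) \cdot \theta$, while

$$ 2C(T \cdot \nabla\theta) = 2T \bullet \nabla\theta = T \bullet L_Vg, $$

using the fact that $T$ is symmetric and (8.11). So

$$ 2T \bullet \nabla(V\!\downarrow) = T \bullet L_Vg, \qquad \int_M T \bullet L_Vg\,\mathbf{g} = -2\int_M (\operatorname{div} T \cdot \theta)\,\mathbf{g}, $$

and hence $\int_M T \bullet L_Vg\,\mathbf{g} = 0$ for all $V$ of compact support if and only if
$\int_M (\operatorname{div} T \cdot \theta)\,\mathbf{g} = 0$ for all $\theta$ of compact support. If $\operatorname{div} T \neq 0$, we can find a
point $p$ and a linear differential form $\theta$ such that $(\operatorname{div} T \cdot \theta)(p) > 0$.
Multiplying $\theta$ by a blip function $\phi$ if necessary, we can arrange that $\theta$ has
compact support and $\operatorname{div} T \cdot \theta \geq 0$, so that $\int_M (\operatorname{div} T \cdot \theta)\,\mathbf{g} > 0$.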

Let us write $\ell_T$ for the linear function on the space of smooth covariant
tensors of compact support given by

$$ \ell_T(S) := \int_M S \bullet T\,\mathbf{g}. $$

We can rewrite (8.13) as

$$ \ell(L_Vg) = 0 \quad \forall\, V \text{ of compact support} \tag{8.14} $$

when $\ell = \ell_T$.

We can ask about condition (8.14) for different types of linear functions, $\ell$.
For example, consider a "delta tensor concentrated at a point", that is a linear
function of the form

$$ \ell(S) = S(p) \bullet t $$

where $t$ is a ("contravariant") symmetric tensor defined at the point $p \in M$. We
claim that no (non-zero) linear function of this form can satisfy (8.14). Indeed,
let $W$ be a vector field of compact support and let $\phi$ be a smooth function which
vanishes at $p$. Set $V = \phi W$. Then

$$ \nabla V\!\downarrow = d\phi \otimes W\!\downarrow + \phi\nabla W\!\downarrow $$

and the second term vanishes at $p$. Therefore condition (8.14) says that

$$ 0 = t \bullet (d\phi(p) \otimes W\!\downarrow(p)) = [t \cdot W\!\downarrow(p)] \cdot d\phi(p). $$

This says that the tangent vector $t \cdot (W\!\downarrow)(p)$ yields zero when applied to the
function $\phi$. Since $d\phi(p)$ can be any covector at $p$, we conclude that

$$ t \cdot (W\!\downarrow)(p) = 0. $$

Now given any tangent vector, $w \in TM_p$, we can always find a vector field $W$
of compact support such that $W(p) = w$. Hence the preceding equation implies
that $t \cdot w\!\downarrow = 0\ \forall\, w \in TM_p$, which implies that $t = 0$.

Let us turn to the next simplest case, a "delta tensor concentrated on a
curve". That is, let $\gamma : I \to M$ be a smooth curve and let $\tau$ be a continuous
function which assigns to each $s \in I$ a symmetric contravariant tensor, $\tau(s)$, at
the point $\gamma(s)$. Define the linear function $\ell_\tau$ on the space of covariant symmetric
tensor fields of compact support by

$$ \ell_\tau(S) := \int_I S(\gamma(s)) \bullet \tau(s)\,ds. $$

Let us examine the implications of (8.14) for $\ell = \ell_\tau$. Once again, let us choose
$V = \phi W$, this time with $\phi = 0$ on $\gamma$. We then get that

$$ \int_I [\tau(s) \cdot W\!\downarrow(\gamma(s))]\phi\,ds = 0 $$

for all vector fields $W$ and all functions $\phi$ of compact support vanishing on $\gamma$.
This implies that for each $s$, the tangent vector $\tau(s) \cdot w\!\downarrow$ is tangent to the curve
$\gamma$ for any tangent vector $w$ at $\gamma(s)$. (For otherwise we could find a function $\phi$
which vanished on $\gamma$ and for which $[\tau(s) \cdot w\!\downarrow]\phi \neq 0$. By extending $w$ to a vector
field $W$ with $W(\gamma(s)) = w$ and modifying $\phi$ if necessary so as to vanish outside
a small neighborhood of $\gamma(s)$ we could then arrange that the integral on the left
hand side of the preceding equation would not vanish.)
The symmetry of $\tau(s)$ then implies that $\tau(s) = c(s)\gamma'(s) \otimes \gamma'(s)$ for some
scalar function, $c$. (Indeed, in local coordinates suppose that $\gamma'(s) = \sum v^i\partial_{i\,\gamma(s)}$
and $\tau(s) = \sum t^{ij}\partial_{i\,\gamma(s)}\partial_{j\,\gamma(s)}$. Applied successively to the basis vectors $w = \partial_{i\,\gamma(s)}$
we conclude that $t^{ij} = c^iv^j$ and hence from $t^{ij} = t^{ji}$ that $t^{ij} = cv^iv^j$.)

Let us assume that $\tau(s) \neq 0$ so $c(s) \neq 0$. Changing the parameterization
means multiplying $\gamma'(s)$ by a scalar factor, and hence multiplying $\tau$ by a positive
factor. So by reparametrizing the curve we can arrange that $\tau = \pm\gamma' \otimes \gamma'$.
To avoid carrying around the $\pm$ sign, let us assume that $\tau = \gamma' \otimes \gamma'$. Since
multiplying $\tau$ by $-1$ does not change the validity of (8.14), we may make this
choice without loss of generality.

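Again let us choose $V = \phi W$, but this time with no restriction on $\phi$, and let
us use the fact that $\tau(s) = \gamma'(s) \otimes \gamma'(s)$. We get

$$ \begin{aligned}
\tau \bullet \nabla V\!\downarrow &= \tau \bullet [d\phi \otimes W\!\downarrow + \phi\nabla W\!\downarrow] \\
&= (\gamma'\phi)\langle\gamma', W\rangle + \phi\langle\nabla_{\gamma'}W, \gamma'\rangle \\
&= \gamma'(\phi\langle\gamma', W\rangle) - \phi\langle W, \nabla_{\gamma'}\gamma'\rangle.
\end{aligned} $$

The integral of this expression must vanish for every vector field $W$ and every
function $\phi$ of compact support. We claim that this implies that $\nabla_{\gamma'}\gamma' = 0$, that
$\gamma$ is a geodesic! Indeed, suppose that $\nabla_{\gamma'(s)}\gamma'(s) \neq 0$ for some value, $s_0$, of $s$.
We could then find a tangent vector $w$ at $\gamma(s_0)$ such that $\langle w, \nabla_{\gamma'(s_0)}\gamma'(s_0)\rangle = 1$
and then extend $w$ to a vector field $W$, and so $\langle W, \nabla_{\gamma'(s)}\gamma'(s)\rangle > 0$ for all $s$ near
$s_0$. Now choose $\phi \geq 0$ with $\phi(\gamma(s_0)) = 1$ and of compact support. Indeed, choose
$\phi$ to have support contained in a small neighborhood of $\gamma(s_0)$, so that

$$ \int_a^b \gamma'(\phi\langle\gamma', W\rangle)\,ds = (\phi\langle\gamma', W\rangle)(\gamma(b)) - (\phi\langle\gamma', W\rangle)(\gamma(a)) = 0 $$

where $a < s_0 < b$ are points in $I$ with $\gamma(b)$ and $\gamma(a)$ outside the support of $\phi$.
We are thus left with

$$ \ell_\tau(L_Vg) = -2\int_a^b \phi\langle W, \nabla_{\gamma'}\gamma'\rangle\,ds < 0. $$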

Conversely, if $\gamma$ is a geodesic and $\tau = \gamma' \otimes \gamma'$ then

$$ \tau \bullet \nabla V\!\downarrow = \langle\nabla_{\gamma'}V, \gamma'\rangle = \gamma'\langle V, \gamma'\rangle - \langle V, \nabla_{\gamma'}\gamma'\rangle. $$

The second term vanishes since $\gamma$ is a geodesic, and the integral of the first term
vanishes so long as $\gamma$ extends beyond the support of $V$ or if $\gamma$ is a closed curve.
We have thus proved a remarkable theorem of Einstein, Infeld and Hoffmann
(1938)

Theorem 1 If $\tau$ is a continuous (contravariant second order) symmetric tensor
field along a curve $\gamma$ whose associated linear function, $\ell_\tau$, satisfies (8.14) then
we can reparametrize $\gamma$ so that it becomes a geodesic and so that $\tau = \pm\gamma' \otimes \gamma'$.
Conversely, if $\tau$ is of this form and if $\gamma$ is unbounded or closed then $\ell_\tau$ satisfies
(8.14).

(Here "unbounded" means that for any compact region, $K$, there are real numbers
$a$ and $b$ such that $\gamma(s) \notin K$ for all $s > b$ or $s < a$.)
8.2 Varying the metric and the connection.

We will regard the space of smooth covariant symmetric tensor fields $S$ such
as those we considered in the preceding section as the "compactly supported
piece" of the tangent space to a given metric $g$. This is to be interpreted in
the following sense: Let $\mathcal{M}$ denote the space of all semi-Riemann metrics on a
manifold, $M$, say all with a fixed signature. If $g \in \mathcal{M}$ is a particular metric,
and if $S$ is a compactly supported symmetric tensor field, then

$$ g + tS $$

is again a metric of the same signature for sufficiently small $|t|$. So we can regard
$S$ as the infinitesimal variation in $g$ along this "line segment" of metrics. On
the other hand, if $g_t$ is any curve of metrics depending smoothly on $t$, and with
the property that $g_t = g$ outside some fixed compact set, $K$, then

$$ \frac{dg_t}{dt}\Big|_{t=0} $$

is a symmetric tensor field of compact support.

So we will denote the space of all compactly supported smooth fields of
symmetric covariant two tensors by

$$ T\mathcal{M}_{\text{compact}}. $$

Notice that we have identified this fixed vector space as the (compactly supported)
tangent space at every point, $g$, in the space of metrics. We have "trivialized"
the tangent bundle to $\mathcal{M}$.

The space of all (symmetric) connections also has a natural trivialization.
Indeed, let $\nabla$ and $\nabla'$ be two connections. Then

$$ \nabla'_{fX}Y - \nabla_{fX}Y = f\nabla'_XY - f\nabla_XY, $$

and similarly $\nabla'_X(fY) - \nabla_X(fY) = f(\nabla'_XY - \nabla_XY)$, since the terms $(Xf)Y$ cancel.
In other words, the map

$$ A : (X, Y) \mapsto \nabla'_XY - \nabla_XY $$

is a tensor; its value at any point $p$ depends only on the values of $X$ and $Y$ at
$p$. We can say that

$$ A = \nabla' - \nabla $$

is a tensor field, of type $T^* \otimes T^* \otimes T$ (one which assigns to every tangent vector
at $p \in M$ an element of $\operatorname{Hom}(TM_p, TM_p)$).

Conversely, if $A$ is any such tensor field and if $\nabla$ is any connection then
$\nabla' = \nabla + A$ is another connection. Thus the space of all connections is an
affine space whose associated linear space is the space of all $A$'s. We will be
interested in symmetric connections, in which case the $A$'s are restricted to being
symmetric: $A_XZ = A_ZX$. (Check this as an exercise.) Let $\mathcal{A}$ denote the space
of all such (smooth) symmetric $A$ and let $\mathcal{C}$ denote the space of all symmetric
connections. Then we can identify the "tangent space" to $\mathcal{C}$ at any connection
$\nabla$ with the space $\mathcal{A}$, because in any affine space we can identify the tangent
space at any point with the associated linear space. In symbols, we may write

$$ T\mathcal{C}_\nabla = \mathcal{A}, $$

independent of the particular $\nabla$. Once again we will be interested in variations
of compact support in the connection, so we will want to consider the space

$$ \mathcal{A}_{\text{compact}} $$

consisting of tensor fields of our given type of compact support.

The Levi-Civita map assigns to every semi-Riemann metric a symmetric connection.
So it can be considered as a map, call it L.C., from metrics to connections:

$$ \text{L.C.} : \mathcal{M} \to \mathcal{C}. $$

The value of $\text{L.C.}(g)$ at any point depends only on $g_{ij}$ and its first derivatives at
the point, and hence the differential of the Levi-Civita map can be considered
as a linear map

$$ d(\text{L.C.})_g : T\mathcal{M}_{\text{compact}} \to \mathcal{A}_{\text{compact}}. $$

(The spaces on both sides are independent of $g$ but the differential definitely
depends on $g$.) In what follows, we will let $A$ denote the value of this differential
at a given $g$ and $S \in T\mathcal{M}_{\text{compact}}$:

$$ A := d(\text{L.C.})_g[S]. $$

As an exercise, you should compute the expression for $A$ in terms of $\nabla S$ where
$\nabla = \text{L.C.}(g)$ is the Levi-Civita connection associated to $g$.
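
For readers who want to check their answer to this exercise, the standard formula for the first variation of the Levi-Civita connection is recorded here, stated without proof and in our own notation (the index form uses $\delta\Gamma^k_{ij}$ for the components of $A$). It can be obtained by differentiating the Koszul formula along $g + tS$.

```latex
\begin{aligned}
\langle A_XY, Z\rangle &= \tfrac{1}{2}\Bigl[(\nabla_XS)(Y, Z) + (\nabla_YS)(X, Z) - (\nabla_ZS)(X, Y)\Bigr],\\[2pt]
\delta\Gamma^k_{ij} &= \tfrac{1}{2}\,g^{kl}\bigl(\nabla_iS_{jl} + \nabla_jS_{il} - \nabla_lS_{ij}\bigr).
\end{aligned}
```

Two quick consistency checks: the right hand side is symmetric in $X$ and $Y$, as required of an element of the space of symmetric $A$'s, and it vanishes when $S$ is a constant multiple of $g$, in agreement with the fact that rescaling the metric by a constant does not change its Levi-Civita connection.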

The map $R$ associates to every metric its Riemann curvature tensor. The
map Ric associates to every metric its Ricci curvature. For reasons that will
soon become apparent, we need to compute the differentials of these maps.

The curvature is expressed in terms of the connection:

$$ R_{X,Y} = \nabla_{[X,Y]} - [\nabla_X, \nabla_Y]. $$

So we may think of the right hand side of this equation as defining a map,
curv, from the space of connections to the space of tensors of curvature type.
Differentiating this expression using Leibniz's rule gives, for any $A \in \mathcal{A}$,

$$ (d\,\text{curv}_\nabla[A])(X, Y) = A_{[X,Y]} - A_X\nabla_Y - \nabla_XA_Y + A_Y\nabla_X + \nabla_YA_X. $$

We have

$$ [X, Y] = \nabla_XY - \nabla_YX $$

so

$$ A_{[X,Y]} = A_{\nabla_XY} - A_{\nabla_YX}. $$
On the other hand, the covariant differential, $\nabla A$, of the tensor field $A$ with
respect to the connection, $\nabla$, is given by

$$ (\nabla A)(X, Y)Z = \nabla_X(A_YZ) - A_{\nabla_XY}Z - A_Y\nabla_XZ $$

or, more succinctly,

$$ (\nabla A)(X, Y) = \nabla_XA_Y - A_{\nabla_XY} - A_Y\nabla_X. $$

From this we see that

$$ (d\,\text{curv}_\nabla[A])(X, Y) = (\nabla A)(Y, X) - (\nabla A)(X, Y). $$

If we let $\widetilde{\nabla A}$ denote the tensor obtained from $\nabla A$ by $\widetilde{\nabla A}(X, Y) = \nabla A(Y, X)$
we can write this equation even more succinctly as

$$ d\,\text{curv}_\nabla[A] = \widetilde{\nabla A} - \nabla A. $$

If we substitute $A = d(\text{L.C.})_g[S]$ into this equation we get, by the chain rule,
the value of $dR_g[S]$. Taking the contraction, $C$, which yields the Ricci tensor
from the Riemann tensor, we obtain

$$ d\,\text{Ric}_g[S] = C(\widetilde{\nabla A} - \nabla A). $$

Let $\hat{g}$ denote the contravariant symmetric tensor corresponding to $g$, the
scalar product induced by $g$ on the cotangent space at each point. Thus, for
example, the scalar curvature, $S$, is obtained from the Ricci curvature by contraction
with $\hat{g}$:

$$ S = \hat{g} \bullet \text{Ric}. $$

Contracting the preceding equation with $\hat{g}$ and using the fact that $\nabla$ commutes
with contraction with $\hat{g}$ and with $C$ we obtain

$$ \hat{g} \bullet d\,\text{Ric}_g[S] = C(\nabla V) $$

where $V$ is the vector field

$$ V := C(A)\!\uparrow - \hat{g} \bullet A. $$

We have $C(\nabla V) = \operatorname{div} V$. Also $V$ has compact support since $S$ does. Hence we
obtain, from the divergence theorem, the following important result:

$$ \int_M \hat{g} \bullet d\,\text{Ric}_g[S]\,\mathbf{g} = 0. \tag{8.15} $$

8.3 The structure of physical laws. 
8.3.1 The Legendre transformation. 

Let $f$ be a function of one real variable. We can consider the map $t \mapsto f'(t)$ which
is known as the Legendre transformation, or the "point slope transformation",
$\mathcal{L}(f)$, associated to $f$. For example, if $f = \frac{1}{2}kt^2$ then the associated Legendre
transformation is the linear map $t \mapsto kt$. As for any transformation, we might
be interested in computing its inverse. That is, find the (or a) point $t$ with a
given value of $f'(t)$.
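
Inverting the map $t \mapsto f'(t)$ can be done numerically when no closed form is at hand. The following is a minimal sketch using Newton's method; it is one possible approach, not a method taken from the text, and it assumes that $f''$ does not vanish near the solution. The function name `invert_slope` and the sample functions are illustrative.

```python
import math

def invert_slope(df, d2f, target, t0=0.0, tol=1e-12, max_iter=50):
    """Solve f'(t) = target by Newton's method, given f' (df) and f'' (d2f)."""
    t = t0
    for _ in range(max_iter):
        step = (df(t) - target) / d2f(t)
        t -= step
        if abs(step) < tol:
            return t
    raise RuntimeError("Newton iteration did not converge")

# f(t) = k t^2 / 2: the Legendre map is t -> k t, with inverse slope -> slope / k.
k = 3.0
t_quad = invert_slope(lambda t: k*t, lambda t: k, target=6.0)

# f(t) = cosh t: the Legendre map is t -> sinh t, with inverse asinh.
t_cosh = invert_slope(math.sinh, math.cosh, target=2.0)
```

For the quadratic example the iteration lands on the exact answer after one step, reflecting the fact that the Legendre transformation of a quadratic function is linear.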

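For a function, $f$, of two variables we can make the same definition and pose
the same question: Define $\mathcal{L}(f)$ as the map

$$ \begin{pmatrix} x \\ y \end{pmatrix} \mapsto (\partial f/\partial x,\ \partial f/\partial y). $$

Given $(a, b)$ we may ask to solve the equations

$$ \begin{aligned}
\partial f/\partial x &= a \\
\partial f/\partial y &= b
\end{aligned} $$

for $\begin{pmatrix} x \\ y \end{pmatrix}$.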

The general situation is as follows: Suppose that $\mathcal{M}$ is a manifold whose
tangent bundle is trivialized, i.e. that we are given a smooth identification of
$T\mathcal{M}$ with $\mathcal{M} \times V$, so that all the tangent spaces are identified with a fixed vector space,
$V$. Of course this also gives an identification of all the cotangent spaces with the
fixed vector space $V^*$. In this situation, if $F$ is a function on $\mathcal{M}$, the associated
Legendre transformation is the map

$$ \mathcal{L}(F) : \mathcal{M} \to V^*, \qquad x \mapsto dF_x. $$

In particular, given $\ell \in V^*$, we may ask to find $x \in \mathcal{M}$ which solves the equation

$$ dF_x = \ell. \tag{8.16} $$

This is the "source equation" of physics, with the caveat that the function $F$
need not be completely defined. Nevertheless, its differential might be defined,
provided that we restrict to "variations with compact support" as is illustrated
by the following example:

In Newtonian physics, the background is Euclidean geometry and the objects
are conservative force fields, which are linear differential forms that are closed.
With a mild loss of generality let us consider "potentials" instead of force fields,
so the objects are functions, $\phi$, on Euclidean three space. Our space $\mathcal{M}$ consists
of all (smooth) functions. Since $\mathcal{M}$ is a vector space, its tangent space is
automatically identified with $\mathcal{M}$ itself, so $V = \mathcal{M}$. The force field associated
with the potential $\phi$ is $-d\phi$, and its "energy density" at a point is one half the
Euclidean length$^2$. That is, the density is given by

$$ \frac{1}{2}(\phi_x^2 + \phi_y^2 + \phi_z^2) $$

where subscript denotes partial derivative. We would like to define the function
$F$ to be the "total energy"

$$ F(\phi) = \frac{1}{2}\int_{\mathbf{R}^3}(\phi_x^2 + \phi_y^2 + \phi_z^2)\,dx\,dy\,dz $$
but there is no reason to believe that this integral need converge. However,
suppose that $s$ is a smooth function of compact support, $K$. Thus $s$ vanishes
outside the closed bounded set, $K$. For any bounded set, $B$, the integral

$$ F_B(\phi) := \frac{1}{2}\int_B(\phi_x^2 + \phi_y^2 + \phi_z^2)\,dx\,dy\,dz $$

converges, and the derivative

$$ \frac{dF_B[\phi + ts]}{dt}\Big|_{t=0} = \int_{\mathbf{R}^3}(\phi_xs_x + \phi_ys_y + \phi_zs_z)\,dx\,dy\,dz $$

exists and is independent of $B$ so long as $B \supset K$. So it is reasonable to define
the right hand side of this equation as $dF_\phi$ evaluated at $s$:

$$ dF_\phi[s] := \int_{\mathbf{R}^3}(\phi_xs_x + \phi_ys_y + \phi_zs_z)\,dx\,dy\,dz $$

even though the function $F$ itself is not well defined. Of course to do so, we
must not take $V = \mathcal{M}$ but take $V$ to be the subspace consisting of functions
of compact support.

A linear function on $V$ is just a density, but in Euclidean space, with
Euclidean volume density $dx\,dy\,dz$, we may identify densities with functions.
Suppose that $\rho$ is a smooth function, and we let $\ell_\rho$ be the corresponding
element of $V^*$,

$$\ell_\rho(s) = \int_{\mathbb{R}^3} s\rho\, dx\,dy\,dz.$$

Equation (8.16) with $\ell = \ell_\rho$ becomes

$$\int_{\mathbb{R}^3} \left(\phi_x s_x + \phi_y s_y + \phi_z s_z\right) dx\,dy\,dz = \int_{\mathbb{R}^3} s\rho\, dx\,dy\,dz \quad \forall s \in V,$$

which is to be regarded as an equation for $\phi$ where $\rho$ is given. We have

$$\phi_x s_x + \phi_y s_y + \phi_z s_z = (\phi_x s)_x + (\phi_y s)_y + (\phi_z s)_z - s\Delta\phi$$

where $\Delta$ is the Euclidean Laplacian,

$$\Delta\phi = \phi_{xx} + \phi_{yy} + \phi_{zz}.$$

Thus, since the total derivatives $(s\phi_x)_x$ etc. contribute zero to the integral,
equation (8.16) is the Poisson equation

$$-\Delta\phi = \rho.$$

As we know, a solution to this equation is given by convolution with the $1/r$
potential:

$$\phi(x,y,z) = \frac{1}{4\pi}\int_{\mathbb{R}^3} \frac{\rho(\xi,\eta,\zeta)}{\sqrt{(x-\xi)^2 + (y-\eta)^2 + (z-\zeta)^2}}\, d\xi\,d\eta\,d\zeta$$

if $\rho$ has compact support, for example, so that this integral converges. In this
sense Euclidean geometry determines the $1/r$ potential.
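
As a quick numerical sanity check (a sketch only, not part of the derivation), the following Python fragment verifies by central differences that the $1/r$ potential, normalized as $1/(4\pi r)$, is annihilated by the Euclidean Laplacian away from the origin. This is the statement that it solves the Poisson equation for a unit point source placed at the origin. The function names and the step size `h` are our own illustrative choices.

```python
import math

def potential(x, y, z):
    """The 1/r potential of a unit point source at the origin: 1 / (4*pi*r)."""
    r = math.sqrt(x * x + y * y + z * z)
    return 1.0 / (4.0 * math.pi * r)

def laplacian(f, x, y, z, h=1e-3):
    """Second-order central-difference approximation of f_xx + f_yy + f_zz."""
    center = f(x, y, z)
    total = (f(x + h, y, z) + f(x - h, y, z)
             + f(x, y + h, z) + f(x, y - h, z)
             + f(x, y, z + h) + f(x, y, z - h) - 6.0 * center)
    return total / (h * h)

if __name__ == "__main__":
    # Any point away from the origin will do; the value should be close to zero.
    print(laplacian(potential, 1.0, 0.5, -0.3))
```

The check only probes points away from the source; the delta function at the origin is what the weak formulation with test functions $s$ accounts for.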

8.3.2 The passive equations.

Symmetries of the function $F$ may lead to constraints on the right hand side of
(8.16). In our example of a function of two variables, suppose that the function
$f$ on the plane is invariant under rotations. Thus $f$ would have to be a function
of the radius, $r$, and hence the right hand side of (8.16) would have to be
proportional to $dr$, and in particular, vanish on vectors tangent to the circle
through the point $(x, y)$.

More generally, suppose that $\mathcal{G}$ is a group acting on $\mathcal{M}$, and that the
function $F$ is invariant under the action of this group, i.e.

$$F(a \cdot x) = F(x) \quad \forall a \in \mathcal{G}.$$

Let $\mathcal{O} = \mathcal{G} \cdot x$ denote the orbit through $x$, so $\mathcal{G} \cdot x$ consists of all points
of the form $ax$, $a \in \mathcal{G}$. Then the function $F$ is constant on $\mathcal{O}$ and so $dF_x$ must
vanish when evaluated on the tangent space to $\mathcal{O}$. We may write this symbolically as

$$\ell \in (T\mathcal{O})^0_x \tag{8.17}$$

if (8.16) holds. Of course, in the infinite dimensional situations where we want
to apply this equation, we must use some imagination to understand what is
meant by the tangent space to the orbit.

We want to consider what happens when we modify $\ell$ by adding to it a
"small" element, $\mu \in V^*$. Presumably the solution $x$ to our "source equation"
(8.16) would then be modified by a small amount and so the tangent space to
the orbit would change. We would then have to apply (8.17) to $\ell + \mu$ using
the modified tangent space. [One situation where disregarding this change in
$x$ could be justified is when $\ell = 0$. Presumably the modification of $x$ will be
of first order in $\mu$, and hence the change in (8.17) will be a second order effect
which can be ignored if $\mu$ is small.]

A passive equation of physics is where we apply (8.17) but disregard the
change in the tangent space and so obtain the equation

$$\mu \in (T\mathcal{O})^0_x. \tag{8.18}$$

The justification for ignoring the non-linear effect of $\mu$ on $x$ may be problematical
from our abstract point of view, but the equation we have just obtained for the
passive reaction of $\mu$ to the presence of $x$ is a powerful principle of physics.
About half the laws of physics are of this form.

We have enunciated two principles of physics, a source equation (8.16) which 
amounts to inverting a Legendre transformation, and the passive equation (8.18) 
which is a consequence of symmetry. We now turn to how Hilbert and Einstein
implemented these principles for gravity.

8.4 The Hilbert "function".

The space $\mathcal{M}$ is the space of Lorentzian metrics on a given manifold, $M$. Hilbert
chooses as his function

$$F(g) = -\int_M S\,\mathbf{g}, \qquad S = g^{-1} \cdot \mathrm{Ric}(g),$$

where $\mathbf{g}$ denotes the volume density of $g$ and $g^{-1}$ its dual metric.
As discussed above, this "function" need not be defined since the integral in
question need not converge. But the differential

$$dF_g[S]$$

will be defined when evaluated on a variation of compact support. (The letter $S$
in square brackets denotes the variation; elsewhere $S$ is the scalar curvature.)
The integral defining $F$ involves $g$ at three locations: in the definition of the
density $\mathbf{g}$, in the dual metric $g^{-1}$ and in Ric. Thus, by Leibniz's rule

$$-dF_g[S] = \int_M g^{-1} \cdot \mathrm{Ric}(g)\, d\mathbf{g}_g[S] + \int_M d(g^{-1})_g[S] \cdot \mathrm{Ric}(g)\,\mathbf{g} + \int_M g^{-1} \cdot d\mathrm{Ric}_g[S]\,\mathbf{g}.$$

We have already done the hard work involved in showing that the third
integral vanishes, equation (8.15). So we are left with the first two terms.
As to the first term, the coordinate free way of rewriting (8.10) is

$$d\mathbf{g}_g[S] = \tfrac{1}{2}\,(g^{-1} \cdot S)\,\mathbf{g}.$$

As to the second term, recall that in local coordinates, the dual metric $g^{-1}$ is
given by $\sum g^{ij}\partial_i\partial_j$ where $(g^{ij})$ is the inverse matrix of $(g_{ij})$. So we recall a
formula we derived for the differential of the inverse function of a matrix. If inv
denotes the inverse function, so

$$\mathrm{inv}(B) = B^{-1},$$

then it follows from differentiating the identity

$$BB^{-1} = I$$

using Leibniz's rule that

$$d\,\mathrm{inv}_B[C] = -B^{-1}CB^{-1}.$$
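
The formula for the differential of the inverse can be checked numerically. The sketch below (our own illustration, with hypothetical helper names and arbitrarily chosen $2 \times 2$ matrices) compares $-B^{-1}CB^{-1}$ with a central difference quotient of $t \mapsto (B + tC)^{-1}$.

```python
def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def mat_inv(a):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]

def d_inv(b, c):
    """The claimed differential of inv at b in the direction c: -b^{-1} c b^{-1}."""
    b_inv = mat_inv(b)
    prod = mat_mul(mat_mul(b_inv, c), b_inv)
    return [[-entry for entry in row] for row in prod]

def d_inv_numeric(b, c, t=1e-6):
    """Central difference quotient of s -> inv(b + s c) at s = 0."""
    plus = mat_inv([[b[i][j] + t * c[i][j] for j in range(2)] for i in range(2)])
    minus = mat_inv([[b[i][j] - t * c[i][j] for j in range(2)] for i in range(2)])
    return [[(plus[i][j] - minus[i][j]) / (2.0 * t) for j in range(2)] for i in range(2)]

def max_difference(b, c):
    """Largest entrywise gap between the exact formula and the difference quotient."""
    exact, approx = d_inv(b, c), d_inv_numeric(b, c)
    return max(abs(exact[i][j] - approx[i][j]) for i in range(2) for j in range(2))
```

For instance, `max_difference([[2.0, 1.0], [1.0, 3.0]], [[0.5, -1.0], [2.0, 0.25]])` should be tiny, limited only by the finite difference error.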

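It follows that the differential of the function $g \mapsto g^{-1}$ (metric to dual metric)
when evaluated at $S$ is $-S^{\uparrow\uparrow}$, where $S^{\uparrow\uparrow}$ is the contravariant symmetric tensor
obtained from $S$ by applying the raising operator (coming from $g$) twice. Now

$$(S^{\uparrow\uparrow}) \cdot \mathrm{Ric} = S \cdot \mathrm{Ric}^{\uparrow\uparrow}.$$

So if we define

$$\mathrm{RIC} := \mathrm{Ric}^{\uparrow\uparrow}$$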






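to be the contravariant form of the Ricci tensor we obtain

$$dF_g[S] = \int_M \left(\mathrm{RIC} - \tfrac{1}{2}S\,g^{-1}\right) \cdot S\,\mathbf{g}. \tag{8.19}$$

(Inside the parentheses $S$ is the scalar curvature and $g^{-1}$ the dual metric; the $S$
after the dot is the variation, and $\mathbf{g}$ is the volume density.)
This is the left hand side of the source equation (8.16). The right hand side is a
linear function on the space $T(\mathcal{M})_{\mathrm{compact}}$. We know that if $T$ is a smooth
symmetric tensor field, then it defines a linear function on $T(\mathcal{M})_{\mathrm{compact}}$ given
by

$$\ell_T(S) = \int_M S \cdot T\,\mathbf{g}.$$

Thus for $\ell = \ell_T$ equation (8.16) becomes the celebrated Einstein field equations

$$\mathrm{RIC} - \tfrac{1}{2}S\,g^{-1} = T. \tag{8.20}$$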

So if we regard the physical objects as semi-Riemann metrics, and if we believe
that matter determines the metric by a source type equation, then matter
should be considered as a linear function on $T(\mathcal{M})_{\mathrm{compact}}$. In particular a
"smooth" matter distribution is a contravariant symmetric tensor field. If we
believe that the laws of physics are described by the function given by Hilbert,
we get the Einstein field equations. Modifying the function would change the
source equations. For example, if we replace the scalar curvature $S$ by $S + c$
where $c$ is a constant, this would have the effect of adding a term $-\tfrac{1}{2}c\,g^{-1}$
(with $g^{-1}$ the dual metric) to the left hand side of the field
equations. This is the notorious "cosmological constant" term.

We will take our group of symmetries to be the group of diffeomorphisms of
$M$ of compact support, that is, diffeomorphisms which are the identity outside some
compact set. Such transformations preserve the function $F$.

If $V$ is a vector field of compact support which generates a one parameter
group $\varphi_s$ of transformations, then these transformations have compact support,
and the fact that the function $F$ is invariant under these transformations
translates into the assertion that $dF_g[L_V g] = 0$.

In other words, the "tangent space to the orbit through $g$" is the subspace
of $T(\mathcal{M})_{\mathrm{compact}}$ consisting of all $L_V g$ where $V$ is a vector field of compact
support. From the results obtained above we now know that the passive equation
translates into

$$\operatorname{div} T = 0$$

for a smooth tensor field and into

$$T = \pm\gamma' \otimes \gamma', \quad \gamma \text{ a geodesic}$$

for a continuous tensor field concentrated along a curve. These results are
independent of the choice of $F$.



8.5 Schrödinger's equation as a passive equation.

In quantum mechanics, the background is a complex Hilbert space. In order to
avoid technicalities, let us assume that $\mathcal{H}$ is a finite dimensional complex vector
space with a Hermitian scalar product. Let $\mathcal{M}$ denote the space of all self adjoint
operators on $\mathcal{H}$. Let $\mathcal{G}$ be the group of all unitary operators, and let $\mathcal{G}$ act on
$\mathcal{M}$ by conjugation: $U \in \mathcal{G}$ acts on $\mathcal{M}$ by sending

$$A \mapsto UAU^{-1}.$$

Since $\mathcal{M}$ is a vector space, its tangent bundle is automatically trivialized. We
may also identify the space of linear functions on $\mathcal{M}$ with $\mathcal{M}$ by assigning to
$B \in \mathcal{M}$ the linear function $\ell_B$ defined by

$$\ell_B(A) = \operatorname{tr} AB.$$

If $C$ is a self adjoint matrix, the tangent to the curve

$$\exp(itC)\,A\,\exp(-itC)$$

at $t = 0$ is $i[C, A]$. So the "tangent space to the orbit through $A$" consists of all
$i[C, A]$.

Show that the passive equation (8.18) becomes

$$[A, B] = 0$$

for $\mu = \ell_B$. A linear function is called a pure state if it is of the form $\ell_B$ where
$B$ is projection onto a one dimensional subspace. This means that there is a
unit vector $\phi \in \mathcal{H}$ (determined up to phase) so that

$$Bu = (u, \phi)\phi \quad \forall u \in \mathcal{H}$$

where $(\ ,\ )$ denotes the scalar product on $\mathcal{H}$.

Show that a pure state satisfies (8.18) if and only if $\phi$ is an eigenvector of
$H$:

$$H\phi = \lambda\phi$$

for some real number $\lambda$, where $H$ denotes the self adjoint operator (the point
of $\mathcal{M}$ called $A$ above) at which the passive equation is imposed. This is the
(time independent) Schrödinger equation.
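
The two exercises can be explored numerically before they are proved. The following sketch (an illustration under our own choices: a fixed $2 \times 2$ self adjoint matrix `A`, the identity and Pauli matrices as a basis of self adjoint $C$, and hypothetical function names) evaluates $\ell_B(i[C, A]) = \operatorname{tr}(i[C, A]B)$ for the projector $B$ onto a vector $\phi$ and tests whether all these pairings vanish.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def projector(phi):
    """Matrix of u -> (u, phi) phi for the normalized vector phi."""
    norm = math.sqrt(sum(abs(c) ** 2 for c in phi))
    unit = [complex(c) / norm for c in phi]
    n = len(unit)
    return [[unit[i] * unit[j].conjugate() for j in range(n)] for i in range(n)]

# A basis of the self adjoint 2x2 matrices: identity and the Pauli matrices.
SELF_ADJOINT_BASIS = [
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def pairing(a, b, c):
    """l_B evaluated on the tangent vector i[C, A], i.e. tr(i [C, A] B)."""
    return 1j * trace(matmul(commutator(c, a), b))

def satisfies_passive_equation(a, phi, tol=1e-12):
    """True when l_B kills every tangent vector i[C, A] to the orbit through A."""
    b = projector(phi)
    return all(abs(pairing(a, b, c)) < tol for c in SELF_ADJOINT_BASIS)

# Example operator with eigenvectors (1, 1) and (1, -1).
A = [[2, 1], [1, 2]]
```

With this `A`, the vectors `[1, 1]` and `[1, -1]` pass the test while `[1, 0]`, which is not an eigenvector, fails it, in agreement with the statement of the exercise.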

8.6 Harmonic maps. 

Let us return to equation (8.18) in the setting of the group of diffeomorphisms
of compact support of a manifold $M$ acting on the semi-Riemannian metrics. In
the case that our linear function $\mu$ was given by a "delta function tensor field
supported along a curve" we saw that condition (8.18) implies that the curve $\gamma$
is a geodesic and the tensor field is $\pm\gamma' \otimes \gamma'$ (under suitable reparametrization of
the curve and assuming that the tensor field does not vanish anywhere on the
curve). We now examine what condition (8.18) says for a "delta function tensor
field" on a more general submanifold. So we are interested in the condition

$$\mu(\mathcal{S}(\nabla V{\downarrow})) = 0$$

for all $V$ of compact support where $\mu$ is provided by the following data:

1. A $k$ dimensional manifold $Q$ and a proper map $f : Q \to M$,

2. A smooth section $t$ of $f^{\sharp}S^2(TM)$, so $t$ assigns to each $q \in Q$ an element
$t(q) \in S^2TM_{f(q)}$, and

3. A density $\omega$ on $Q$.

For any section $s$ of $S^2T^*M$ and any $q \in Q$ we can form the "double contraction"
$s(q) \cdot t(q)$ since $s(q)$ (meaning $s$ at the point $f(q)$) and $t(q)$ take values in dual
vector spaces, and since $f$ is proper, if $s$ has compact support then so does the
function $q \mapsto s(q) \cdot t(q)$ on $Q$. We can then form the integral

$$\mu[s] := \int_Q s(\cdot) \cdot t(\cdot)\,\omega. \tag{8.21}$$

We observe (and this will be important in what follows) that $\mu$ depends on the
tensor product $t \otimes \omega$ as a section of $f^{\sharp}S^2TM \otimes D$, where $D$ denotes the line
bundle of densities of $Q$, rather than on the individual factors.

We apply the equation $\mu(\mathcal{S}(\nabla V{\downarrow})) = 0$ to this $\mu$ and to $V = \phi W$ where $\phi$ is
a function of compact support and $W$ a vector field of compact support on $M$.
Since

$$\nabla(\phi W){\downarrow} = d\phi \otimes W{\downarrow} + \phi\nabla W{\downarrow}$$

and $t$ is symmetric, this becomes

$$\int_Q t \cdot \left(d\phi \otimes W{\downarrow} + \phi\nabla W{\downarrow}\right)\omega = 0. \tag{8.22}$$

We first apply this to a $\phi$ which vanishes on $f(Q)$, so that the term $\phi\nabla W{\downarrow}$
vanishes when restricted to $Q$. We conclude that the "single contraction" $t \cdot \theta$
must be tangent to $f(Q)$ at all points for all linear differential forms $\theta$ and hence
that

$$t = df_* h$$

for some section $h$ of $S^2(TQ)$.

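Again, let us apply condition (8.22), but no longer assume that $\phi$ vanishes
on $f(Q)$. For any vector field $Z$ on $Q$ let us, by abuse of language, write

$$Z\phi \ \text{ for } \ Zf^*\phi$$

for any function $\phi$ on $M$, write

$$\langle Z, W\rangle \ \text{ for } \ \langle df_* Z, W\rangle_M$$

where $W$ is a vector field on $M$, and

$$\nabla_Z W \ \text{ for } \ \nabla_{df_* Z} W.$$

Write

$$h = \sum h^{ij} e_i e_j$$

in terms of a local frame field $e_1, \ldots, e_k$ on $Q$. Then

$$t \cdot (\nabla(\phi W){\downarrow}) = \sum h^{ij}\left[(e_i\phi)\langle e_j, W\rangle + \phi\langle \nabla_{e_i} W, e_j\rangle\right].$$

Now

$$\langle \nabla_{e_i} W, e_j\rangle = e_i\langle W, e_j\rangle - \langle W, \nabla_{e_i} e_j\rangle$$

so

$$t \cdot (\nabla(\phi W){\downarrow}) = \sum \left[h^{ij} e_i\left(\phi\langle e_j, W\rangle\right) - \phi\langle W, h^{ij}\nabla_{e_i} e_j\rangle\right].$$

Also,

$$\int_Q \sum h^{ij} e_i\left(\phi\langle e_j, W\rangle\right)\omega = -\int_Q \sum_j \phi\langle e_j, W\rangle\, L_{\sum_i h^{ij} e_i}\,\omega.$$

Let us write

$$L_{\sum_i h^{ij} e_i}\,\omega = z^j\omega$$

so the right hand side of the preceding equation is $-\int_Q \phi\,\langle \sum_j z^j e_j, W\rangle\,\omega$.
If we set

$$Z := \sum_j z^j e_j$$

then, since $\phi$ and $W$ are arbitrary, condition (8.22) becomes

$$\sum h^{ij}\,{}^{M}\nabla_{e_i} e_j = -Z, \tag{8.23}$$

where we have used ${}^{M}\nabla$ to emphasize that we are using the covariant derivative
with respect to the Levi-Civita connection on $M$, i.e.

$${}^{M}\nabla_{e_i} e_j := \nabla_{df_*(e_i)}\left(df_*(e_j)\right).$$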

To understand (8.23) suppose that we assume that $h$ is non-degenerate, and
so induces a semi-Riemannian metric $h$ on $Q$, and let us assume that $\omega$ is the
volume form associated with $h$. (In all dimensions except $k = 2$ this second
assumption is harmless, since we can rescale $h$ to arrange it to be true.) Let
${}^{h}\nabla$ denote covariant differential with respect to $h$. Let us choose the frame field
$e_1, \ldots, e_k$ to be "orthonormal" with respect to $h$, i.e.

$$h^{ij} = \epsilon_j\delta_{ij}, \quad \text{where } \epsilon_j = \pm 1$$

so that

$$\sum_i h^{ij} e_i = \epsilon_j e_j.$$

Then

$$L_{e_j}\omega = C({}^{h}\nabla e_j)\,\omega$$

and

$$C({}^{h}\nabla e_j) = \sum_i \epsilon_i\langle {}^{h}\nabla_{e_i} e_j, e_i\rangle_h = -\sum_i \epsilon_i\langle e_j, {}^{h}\nabla_{e_i} e_i\rangle_h,$$

so

$$Z = \sum_j \epsilon_j\, C({}^{h}\nabla e_j)\, e_j = -\sum_i \epsilon_i\, {}^{h}\nabla_{e_i} e_i = -\sum_{ij} h^{ij}\, {}^{h}\nabla_{e_i} e_j.$$

Given a metric $h$ on $Q$ and a metric $g$ on $M$, the second fundamental form
of a map $f : Q \to M$ is defined as

$$B_f(X, Y) := {}^{g}\nabla_{df(X)}\left(df(Y)\right) - df\left({}^{h}\nabla_X Y\right). \tag{8.24}$$

Here $X$ and $Y$ are vector fields on $Q$ and $df(X)$ denotes the "vector field along $f$"
which assigns to each $q \in Q$ the vector $df_q(X_q) \in TM_{f(q)}$.

The tension field $\tau(f)$ of the map $f$ (relative to a given $g$ and $h$) is the
trace of the second fundamental form so

$$\tau(f) = \sum_{ij} h^{ij}\left[{}^{g}\nabla_{df(e_i)}\left(df(e_j)\right) - df\left({}^{h}\nabla_{e_i} e_j\right)\right]$$

in terms of a local frame field.

A map $f$ such that $\tau(f) = 0$ is called harmonic. We thus see that under
the above assumptions about $h$ and $\omega$

Theorem 2 Condition (8.22) says that $f$ is harmonic relative to $g$ and $h$.
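
In the simplest situation, where $Q = \mathbb{R}^2$ and $M = \mathbb{R}^n$ both carry their Euclidean metrics and we use the standard frame, both connections are flat, so (8.24) reduces to $B_f(e_i, e_j) = \partial_i\partial_j f$ and the tension field is the componentwise Laplacian of $f$. The sketch below (our own illustration; the function names and the finite difference step are assumptions) approximates this tension field and evaluates it on two sample maps.

```python
def tension_flat(f, x, y, h=1e-3):
    """Tension field of f: R^2 -> R^n for flat metrics: the componentwise Laplacian,
    approximated by the five-point central-difference stencil."""
    center = f(x, y)
    neighbours = [f(x + h, y), f(x - h, y), f(x, y + h), f(x, y - h)]
    return tuple(
        (sum(p[a] for p in neighbours) - 4.0 * center[a]) / (h * h)
        for a in range(len(center))
    )

def complex_square(x, y):
    """Real and imaginary parts of (x + iy)^2; both components are harmonic functions."""
    return (x * x - y * y, 2.0 * x * y)

def not_harmonic(x, y):
    """A map whose first component has Laplacian 2, so its tension field is (2, 0)."""
    return (x * x, y)
```

Here `tension_flat(complex_square, 0.3, -1.2)` is zero up to rounding, so that map is harmonic, while `not_harmonic` has a non-vanishing tension field.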

Suppose that we make the further assumption that $h$ is the metric induced
from $g$ by the map $f$. Then

$$df\left({}^{h}\nabla_X Y\right) = \left({}^{g}\nabla_{df(X)}\, df(Y)\right)^{\mathrm{tan}},$$

the tangential component of ${}^{g}\nabla_{df(X)}\, df(Y)$, and hence

$$B_f(X, Y) = \left({}^{g}\nabla_{df(X)}\, df(Y)\right)^{\mathrm{nor}},$$

the normal component of ${}^{g}\nabla_{df(X)}\, df(Y)$. This is just the classical second
fundamental form vector of $Q$ regarded as an immersed submanifold of $M$. Taking its
trace gives $kH$ where $H$ is the mean curvature vector of the immersion. Thus if
in addition to the above assumptions we make the assumption that the metric
$h$ is induced by the map $f$, then we conclude that (8.18) says that $H = 0$, i.e.
that the immersion $f$ must be a minimal immersion.



Chapter 9 

Submersions. 



The treatment here is that of a 1966 paper by O'Neill (Michigan Mathematical
Journal) following earlier basic work by Hermann. In a sense, the subject can
be regarded as the appropriate generalization of the notion of a "surface of
revolution".

9.1 Submersions. 

Let $M$ and $B$ be differentiable manifolds, and $\pi : M \to B$ be a submersion,
which means that $d\pi_m : TM_m \to TB_{\pi(m)}$ is surjective for all $m \in M$. The
implicit function theorem then guarantees that $\pi^{-1}(b)$ is a submanifold of $M$
for all $b \in B$. These submanifolds are called the fibers of the submersion. By
the implicit function theorem, the tangent space to the fiber through $m \in M$
is just the kernel of the differential of the projection, $\pi$. Call this space $V(M)_m$.
So

$$V(M)_m := \ker d\pi_m.$$

The set of such tangent vectors at $m$ is called the set of vertical vectors, and
a vector field on $M$ whose values at every point are vertical will be called a
vertical vector field. We will denote the set of vertical vector fields by $\mathcal{V}(M)$.

If $\phi$ is a smooth function on $B$, and $V$ is a vertical vector field, then $V\pi^*\phi = 0$.
Conversely, if $V\pi^*\phi = 0$ for all smooth functions $\phi$ on $B$, then $V$ is vertical.
In particular, if $U$ and $V$ are vertical vector fields, then so is $[U, V]$.

Now suppose that both $M$ and $B$ are (semi-)Riemann manifolds. Let

$$H(M)_m := V(M)_m^{\perp}.$$

We assume the following:

$$d\pi_m : H(M)_m \to TB_{\pi(m)}$$

is an isometric isomorphism, i.e. is bijective and preserves the scalar product of
tangent vectors. Notice that this implies that $V(M)_m \cap H(M)_m = \{0\}$ so that
the restriction of the scalar product to $V(M)_m$ is non-singular. (Of course in
the Riemannian case this is automatic.)
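
A concrete instance, offered only as an illustrative sketch: take $M$ to be the punctured Euclidean plane, $B = (0, \infty)$ with its standard metric, and $\pi(x, y) = \sqrt{x^2 + y^2}$. The fibers are circles, vertical vectors are tangent to the circles, and horizontal vectors are radial. The helper names below are our own.

```python
import math

def d_pi(m, v):
    """Differential at m of pi(x, y) = sqrt(x^2 + y^2), applied to the tangent vector v."""
    x, y = m
    r = math.hypot(x, y)
    return (x * v[0] + y * v[1]) / r

def split(m, v):
    """Return (horizontal, vertical) parts of v at m: radial and tangent-to-circle."""
    x, y = m
    r = math.hypot(x, y)
    radial = (x / r, y / r)
    c = v[0] * radial[0] + v[1] * radial[1]
    horizontal = (c * radial[0], c * radial[1])
    vertical = (v[0] - horizontal[0], v[1] - horizontal[1])
    return horizontal, vertical
```

For any point `m` and vector `v`, the vertical part is killed by `d_pi`, and the length of the horizontal part equals `abs(d_pi(m, v))`, which is the isometry condition on horizontal vectors.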

We let $\mathcal{H} : T(M)_m \to H(M)_m$ denote the orthogonal projection at each
point and also let $\mathcal{H}(M)$ denote the set of "horizontal" vector fields (vector fields
which belong to $H(M)_m$ at each point). Similarly, we let $\mathcal{V}$ denote orthogonal
projection onto $V(M)_m$ at each point. So if $E$ is a vector field on $M$, then $\mathcal{V}E$
is a vertical vector field and $\mathcal{H}E$ is a horizontal vector field. We will reserve the
letters $U, V, W$ for vertical vector fields, and $X, Y, Z$ for horizontal vector fields.

Among the horizontal vector fields, there is a subclass, the basic vector fields.
They are defined as follows: Let $X_B$ be a vector field on $B$. If $m \in M$, there
is a unique tangent vector, call it $X(m) \in H(M)_m$, such that $d\pi_m X(m) =
X_B(\pi(m))$. This defines the basic vector field, $X$, corresponding to $X_B$.
Notice that if $X$ is the basic vector field corresponding to $X_B$, and if $\phi$ is a
smooth function on $B$, then

$$X\pi^*\phi = \pi^*(X_B\phi).$$

Also, by definition,

$$\langle X, Y\rangle_M = \pi^*\langle X_B, Y_B\rangle_B$$

for basic vector fields $X$ and $Y$. In general, if $X$ and $Y$ are horizontal, or even
basic vector fields, their Lie bracket $[X, Y]$ need not be horizontal. But if $X$
and $Y$ are basic, then we can compute the horizontal component of $[X, Y]$ as
follows: If $\phi$ is any function on $B$ and if $X$ and $Y$ are basic vector fields, then

$$\begin{aligned}
(\mathcal{H}[X, Y])\pi^*\phi &= [X, Y]\pi^*\phi \\
&= XY\pi^*\phi - YX\pi^*\phi \\
&= \pi^*(X_B Y_B\phi - Y_B X_B\phi) \\
&= \pi^*([X_B, Y_B]\phi)
\end{aligned}$$

so $\mathcal{H}[X, Y]$ is the basic vector field corresponding to $[X_B, Y_B]$.
We claim that

$$\mathcal{H}(\nabla_X Y) \ \text{ is the basic vector field corresponding to } \ \nabla^B_{X_B}(Y_B) \tag{9.1}$$

where $\nabla^B$ denotes the Levi-Civita covariant derivative on $B$ and $\nabla$ denotes the
covariant derivative on $M$. Indeed, let $X_B, Y_B, Z_B$ be vector fields on $B$ and
$X, Y, Z$ the corresponding basic vector fields on $M$. Then

$$X\langle Y, Z\rangle_M = X\left(\pi^*\langle Y_B, Z_B\rangle_B\right) = \pi^*\left(X_B\langle Y_B, Z_B\rangle_B\right)$$

while

$$\langle X, [Y, Z]\rangle = \langle X, \mathcal{H}[Y, Z]\rangle = \pi^*\left(\langle X_B, [Y_B, Z_B]\rangle_B\right)$$

since $\mathcal{H}[Y, Z]$ is the basic vector field corresponding to $[Y_B, Z_B]$. From the
Koszul formula it then follows that

$$\langle \nabla_X Y, Z\rangle_M = \pi^*\langle \nabla^B_{X_B} Y_B, Z_B\rangle_B.$$

Therefore $d\pi_m(\nabla_X Y(m)) = \nabla^B_{X_B} Y_B(\pi(m))$ for all points $m$, which implies (9.1).

Suppose that $\gamma$ is a horizontal geodesic, so that $\pi \circ \gamma$ is a regular curve, so
an integral curve of a vector field $X_B$ on $B$. Let $X$ be the corresponding basic
vector field, so $\gamma$ is an integral curve of $X$. The fact that $\gamma$ is a geodesic implies
that $\nabla_X X = 0$ along $\gamma$, and hence by (9.1) $\nabla^B_{X_B} X_B = 0$ along $\pi \circ \gamma$, so $\pi \circ \gamma$ is a
geodesic. We have proved

$$\pi(\gamma) \text{ is a geodesic if } \gamma \text{ is a horizontal geodesic.} \tag{9.2}$$

If $V$ and $W$ are vertical vector fields, then we may consider their restriction
to each fiber as a vector field along that fiber, and may also consider the
Levi-Civita connection on the fiber considered as a semi-Riemann manifold in its own
right. We will denote the covariant derivative of $W$ with respect to $V$ relative
to the connection induced by the metric on each fiber by $\nabla^{\mathcal{V}}_V W$. It follows from
the Koszul formula, and the fact that $[V, W]$ is vertical if $V$ and $W$ are, that

$$\nabla^{\mathcal{V}}_V W = \mathcal{V}(\nabla_V W) \tag{9.3}$$

for vertical vector fields. Here $\nabla$ is the Levi-Civita covariant derivative on $M$,
so that $\nabla_V W$ has both a horizontal and a vertical component.

9.2 The fundamental tensors of a submersion. 

9.2.1 The tensor T. 

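For arbitrary vector fields $E$ and $F$ on $M$ define

$$T_E F := \mathcal{H}\left[\nabla_{\mathcal{V}E}(\mathcal{V}F)\right] + \mathcal{V}\left[\nabla_{\mathcal{V}E}(\mathcal{H}F)\right],$$

where, in this equation, $\nabla$ denotes the Levi-Civita covariant derivative
determined by the metric on $M$.

If $f$ is any differentiable function on $M$, then $\mathcal{V}(fF) = f\mathcal{V}F$ and $\nabla_{\mathcal{V}E}(f\mathcal{V}F) =
[(\mathcal{V}E)f]\mathcal{V}F + f\nabla_{\mathcal{V}E}(\mathcal{V}F)$ so

$$\mathcal{H}\left[\nabla_{\mathcal{V}E}(\mathcal{V}(fF))\right] = f\mathcal{H}\left[\nabla_{\mathcal{V}E}(\mathcal{V}F)\right].$$

Similarly $f$ pulls out of the second term in the definition of $T$. Also $\mathcal{V}(fE) =
f\mathcal{V}E$ and $\nabla_{f\mathcal{V}E} = f\nabla_{\mathcal{V}E}$ by a defining property of $\nabla$.

This proves that $T$ is a tensor of type (1,2): $T_{fE}F = T_E(fF) = fT_E F$.

By definition, $T_E = T_{\mathcal{V}E}$ depends only on the vertical component, $\mathcal{V}E$, of $E$.
If $U$ and $V$ are vertical vector fields, then

$$\begin{aligned}
T_U V &= \mathcal{H}\nabla_U V \\
&= \mathcal{H}\nabla_V U + \mathcal{H}([U, V]) \\
&= \mathcal{H}\nabla_V U
\end{aligned}$$

since $[U, V]$ is vertical. Thus

$$T_U V = T_V U \tag{9.4}$$

for vertical vector fields. Also notice that if $U$ is a vertical vector field then $T_E U$
is horizontal, while if $X$ is a horizontal vector field, then $T_E X$ is vertical.

1. Show that

$$\langle T_E F_1, F_2\rangle = -\langle F_1, T_E F_2\rangle \tag{9.5}$$

for any pair of vector fields $F_1, F_2$.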

9.2.2 The tensor A. 

This is defined by interchanging the role of horizontal and vertical in $T$, so

$$A_E F := \mathcal{V}\nabla_{\mathcal{H}E}(\mathcal{H}F) + \mathcal{H}\nabla_{\mathcal{H}E}(\mathcal{V}F).$$

The same proof as above shows that $A$ is a tensor, that $A_E$ sends horizontal
vector fields into vertical vector fields and vice versa, and your solution of
problem 1 will also show that

$$\langle A_E F_1, F_2\rangle = -\langle F_1, A_E F_2\rangle$$

for any pair of vector fields $F_1, F_2$.

Notice that any horizontal vector field can be written (locally) as a
function combination of basic vector fields, and if $V$ is vertical and $X$ basic,
then

$$[V, X]\pi^*\phi = V\pi^*(X_B\phi) - XV\pi^*\phi = 0,$$

so the Lie bracket of a vertical vector field and a basic vector field is vertical.

2. Show that $A_X X = 0$ for any horizontal vector field $X$, and hence that

$$A_X Y = -A_Y X \tag{9.6}$$

for any pair of horizontal vector fields $X, Y$. Since

$$\mathcal{V}[X, Y] = \mathcal{V}(\nabla_X Y - \nabla_Y X) = A_X Y - A_Y X$$

it then follows that

$$A_X Y = \tfrac{1}{2}\mathcal{V}[X, Y]. \tag{9.7}$$

(Hint: it suffices to show that $A_X X = 0$ for basic vector fields, and for this, that
$\langle V, A_X X\rangle = 0$ for all vertical vector fields $V$, since $A_X X$ is vertical. Use Koszul's
formula.)

We can express the relations between covariant derivatives of horizontal and
vertical vector fields and the tensors $T$ and $A$:

$$\nabla_V W = T_V W + \nabla^{\mathcal{V}}_V W \tag{9.8}$$

$$\nabla_V X = \mathcal{H}\nabla_V X + T_V X \tag{9.9}$$

$$\nabla_X V = A_X V + \mathcal{V}\nabla_X V \tag{9.10}$$

$$\nabla_X Y = \mathcal{H}\nabla_X Y + A_X Y \tag{9.11}$$

If $X$ is a basic vector field, then $\nabla_V X = \nabla_X V + [V, X]$ and $[V, X]$ is vertical.
Hence

$$\mathcal{H}\nabla_V X = A_X V \ \text{ if } X \text{ is basic.} \tag{9.12}$$

9.2.3 Covariant derivatives of T and A. 

The definition of covariant derivative of a tensor field gives

$$(\nabla_{E_1} A)_{E_2} E_3 = \nabla_{E_1}(A_{E_2} E_3) - A_{\nabla_{E_1} E_2} E_3 - A_{E_2}(\nabla_{E_1} E_3)$$

for any three vector fields $E_1, E_2, E_3$. Suppose, in this equation, we take $E_1 = V$
and $E_2 = W$ to be vertical, and $E_3 = E$ to be a general vector field. Then
$A_{E_2} = A_W = 0$ so the first and third terms on the right vanish. In the middle
term we have

$$A_{\nabla_V W} = A_{\mathcal{H}\nabla_V W} = A_{T_V W}$$

so that we get

$$(\nabla_V A)_W = -A_{T_V W}. \tag{9.13}$$

If we take $E_1 = X$ to be horizontal and $E_2 = W$ to be vertical, again only the
middle term survives and we get

$$(\nabla_X A)_W = -A_{A_X W}. \tag{9.14}$$

Similarly,

$$(\nabla_X T)_Y = -T_{A_X Y} \tag{9.15}$$

$$(\nabla_V T)_Y = -T_{T_V Y}. \tag{9.16}$$


3. Show that

$$\langle (\nabla_U A)_X V, W\rangle = \langle T_U V, A_X W\rangle - \langle T_U W, A_X V\rangle \tag{9.17}$$

$$\langle (\nabla_E A)_X Y, V\rangle = -\langle (\nabla_E A)_Y X, V\rangle \tag{9.18}$$

$$\langle (\nabla_E T)_V W, X\rangle = \langle (\nabla_E T)_W V, X\rangle \tag{9.19}$$

where $U, V, W$ are vertical, $X, Y$ are horizontal and $E$ is a general vector field.

We also claim that

$$\mathfrak{S}\langle (\nabla_Z A)_X Y, V\rangle = \mathfrak{S}\langle A_X Y, T_V Z\rangle \tag{9.20}$$

where $V$ is vertical, $X, Y, Z$ horizontal and $\mathfrak{S}$ denotes cyclic sum over the
horizontal vectors.

Proof. This is a tensor equation, so we may assume that $X, Y, Z$ are basic
and that the corresponding vector fields $X_B, Y_B, Z_B$ have all their Lie brackets
vanish at $b = \pi(m)$ where $m$ is the point at which we want to check the equation.
Thus all Lie brackets of $X, Y, Z$ are vertical at $m$. We have $\tfrac{1}{2}[X, Y] = A_X Y$ by
(9.7), so

$$\tfrac{1}{2}[[X, Y], Z] = [A_X Y, Z] = \nabla_{A_X Y}(Z) - \nabla_Z(A_X Y)$$

and the cyclic sum of the leftmost side vanishes by the Jacobi identity. So

$$\mathfrak{S}\left[\nabla_{A_X Y}(Z)\right] = \mathfrak{S}\left[\nabla_Z(A_X Y)\right]. \tag{9.21}$$

Taking scalar product with the vertical vector $V(m)$, we have (at the point $m$)
by repeated use of (9.4) and (9.5)

$$\begin{aligned}
\langle \nabla_{A_X Y}(Z), V\rangle &= \langle T_{A_X Y}(Z), V\rangle \\
&= -\langle Z, T_{A_X Y}(V)\rangle \\
&= -\langle Z, T_V(A_X Y)\rangle \\
&= \langle T_V Z, A_X Y\rangle.
\end{aligned}$$

We record this fact for later use as

$$\langle T_{A_X Y}(Z), V\rangle = \langle T_V Z, A_X Y\rangle. \tag{9.22}$$

Using (9.21) we obtain

$$\mathfrak{S}\langle \nabla_Z(A_X Y), V\rangle = \mathfrak{S}\langle T_V Z, A_X Y\rangle. \tag{9.23}$$

Now

$$\langle \nabla_Z(A_X Y), V\rangle - \langle (\nabla_Z A)_X Y, V\rangle = \langle A_{\nabla_Z X}(Y), V\rangle + \langle A_X(\nabla_Z Y), V\rangle$$

while

$$A_{\nabla_Z X}(Y) = -A_Y(\mathcal{H}\nabla_Z X) = -A_Y(\mathcal{H}\nabla_X Z)$$

using (9.6) for the first equation and the fact that $[X, Z]$ is vertical for the
second equation. Taking scalar product with $V$ gives

$$-\langle A_Y(\mathcal{H}\nabla_X Z), V\rangle = -\langle A_Y(\nabla_X Z), V\rangle$$

since $A_Y U$ is horizontal for any vertical vector $U$, and hence

$$\langle A_Y(\mathcal{V}\nabla_X Z), V\rangle = 0.$$

We thus obtain

$$\langle \nabla_Z(A_X Y), V\rangle - \langle (\nabla_Z A)_X Y, V\rangle = \langle A_X(\nabla_Z Y), V\rangle - \langle A_Y(\nabla_X Z), V\rangle.$$

The cyclic sum of the right hand side vanishes. So, taking cyclic sum and
applying (9.23) establishes (9.20).

9.2.4 The fundamental tensors for a warped product.

A very special case of a semi-Riemann submersion is that of a "warped product",
following O'Neill's terminology. Here $M = B \times F$ as a manifold, so $\pi$ is just
projection onto the first factor. We are given a positive function $f$ on $B$ and
metrics $\langle\ ,\ \rangle_B$ and $\langle\ ,\ \rangle_F$ on each factor. At each point $m = (b, q)$, $b \in B$, $q \in F$,
we have the direct sum decomposition

$$TM_m = TB_b \oplus TF_q$$

as vector spaces, and the warped product metric is defined as the direct sum

$$\langle\ ,\ \rangle = \langle\ ,\ \rangle_B \oplus f^2\langle\ ,\ \rangle_F.$$

O'Neill writes $M = B \times_f F$ for the warped product, the metrics on $B$ and $F$
being understood. The notion of warped product can itself be considered as a
generalization of a surface of revolution, where $B$ is a plane curve not meeting
the axis of revolution, where $f$ is the distance to the axis, and where $F = S^1$,
the unit circle with its standard metric.

On a warped product, the basic vector fields are just the vector fields of $B$
considered as vector fields of $B \times F$ in the obvious way, having no $F$ component.
In particular, the Lie bracket of two basic vector fields, $X$ and $Y$ on $M$, is just the
Lie bracket of the corresponding vector fields $X_B$ and $Y_B$ on $B$, considered as
a vector field on $M$ via the direct product. In particular, $[X, Y]$ has no vertical
component, so $A_X Y = 0$. In fact, we can be more precise. For each fixed $q \in F$,
the projection $\pi$ restricted to $B \times \{q\}$ is an isometry of $B \times \{q\}$ with $B$. Thus

$$\nabla_X Y = \text{the basic vector field corresponding to } \nabla^B_{X_B} Y_B.$$

On a warped product, there is a special class of vertical vector fields, those
that are vector fields on $F$ considered as vector fields on $B \times F$ via the direct
product decomposition. Let us denote the collection of these vector fields by
$\mathcal{L}(F)$, the "lifts" of vector fields on $F$, to use O'Neill's terminology. If $V \in \mathcal{L}(F)$
and $X$ is a basic vector field, then $[X, V] = 0$ since they "depend on different
variables" and hence $\nabla_X V = \nabla_V X$. The vector field $\nabla_X V$ is vertical, since
$\langle \nabla_X V, Y\rangle = -\langle V, \nabla_X Y\rangle = 0$ for any basic vector field $Y$, as $\nabla_X Y$ is horizontal.
This shows that $A_X V = 0$ as well, so $A = 0$. We claim that once again we can
be more precise:

$$\nabla_X V = \nabla_V X = \frac{Xf}{f}\,V \quad \forall \text{ basic } X \text{ and } V \in \mathcal{L}(F). \tag{9.24}$$

Indeed, the only term that survives in the Koszul formula for $2\langle \nabla_X V, W\rangle$,
$W \in \mathcal{L}(F)$, is $X\langle V, W\rangle$. We have

$$\langle V, W\rangle = f^2\langle V_F, W_F\rangle_F$$

where we have written $f$ instead of $\pi^* f$ by the usual abuse of language for a
direct product. Now $\langle V_F, W_F\rangle_F$ is a function on $F$ (pulled back to $B \times F$) and
so is annihilated by $X$. Hence

$$X\langle V, W\rangle = 2f(Xf)\langle V_F, W_F\rangle_F = \frac{2Xf}{f}\langle V, W\rangle,$$

proving (9.24). Notice that (9.24) gives us a piece of $T$, namely

$$T_V X = \frac{Xf}{f}\,V.$$

We can also derive the "horizontal" piece of $T$, namely

$$T_V W = -\frac{\langle V, W\rangle}{f}\operatorname{grad} f. \tag{9.25}$$

Indeed

$$\langle T_V W, X\rangle = -\langle W, \nabla_V X\rangle = -\frac{Xf}{f}\langle V, W\rangle \quad \text{and} \quad Xf = \langle \operatorname{grad} f, X\rangle.$$

In this formula, it doesn't matter whether we consider $f$ as a function on $M$
and compute its gradient there, or think of $f$ as a function on $B$ and compute
its gradient relative to $B$ and then take the horizontal lift. The answer is
the same since $f$ has no $F$ dependence. Finally, the vertical component of
$\nabla_V W$, $V, W \in \mathcal{L}(F)$, is just the same as the extension to $M$ of $\nabla^F_{V_F} W_F$ since
the metric on each fiber differs from that of $F$ by a constant factor, which has
no influence on the covariant derivative.
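
For a surface of revolution type example, $B = (0, \infty)$ with coordinate $r$ and $F = S^1$ with coordinate $\theta$, the warped metric is $dr^2 + f(r)^2 d\theta^2$, and (9.24) and (9.25) predict the Christoffel symbols $\Gamma^\theta_{r\theta} = f'/f$ and $\Gamma^r_{\theta\theta} = -ff'$. The sketch below is our own numerical check of this prediction, computing the Christoffel symbols from the standard coordinate formula with a finite difference derivative of the metric; the names and the step size are assumptions.

```python
import math

def metric(f, r):
    """Warped product metric dr^2 + f(r)^2 dtheta^2 in coordinates (r, theta)."""
    return [[1.0, 0.0], [0.0, f(r) ** 2]]

def christoffel(f, r, h=1e-5):
    """Gamma[k][i][j] from the standard formula; only d/dr of the metric is non-zero."""
    g = metric(f, r)
    g_inv = [[1.0 / g[0][0], 0.0], [0.0, 1.0 / g[1][1]]]
    g_plus, g_minus = metric(f, r + h), metric(f, r - h)
    d_r = [[(g_plus[i][j] - g_minus[i][j]) / (2.0 * h) for j in range(2)] for i in range(2)]
    zero = [[0.0, 0.0], [0.0, 0.0]]
    dg = [d_r, zero]  # dg[l][i][j] = partial derivative of g_ij in coordinate l
    return [[[0.5 * sum(g_inv[k][l] * (dg[i][j][l] + dg[j][i][l] - dg[l][i][j])
                        for l in range(2))
              for j in range(2)] for i in range(2)] for k in range(2)]
```

With index 0 for $r$ and 1 for $\theta$, the entry `christoffel(f, r)[1][0][1]` should agree with $f'(r)/f(r)$ and `christoffel(f, r)[0][1][1]` with $-f(r)f'(r)$, the latter being (9.25) applied to $V = W = \partial_\theta$.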



9.3 Curvature.

We want equations relating the curvature of the base and the curvature of the
fibers to the curvature of $M$ and the tensors $T$, $A$, and their covariant derivatives.
So we will be considering expressions of the form

$$\langle R_{E_1 E_2} E_3, E_4\rangle$$

where $R$ is the curvature of $M$ and the $E$'s are either horizontal or vertical. We
let $n = 0, 1, 2, 3$, or $4$ denote the number of horizontal vectors, the remaining
being vertical. This gives five cases. So we will get five equations for curvature.
For example, $n = 0$ corresponds to all vectors vertical, so we are asking for
the relation between the curvature of the fiber and the full curvature. Let $R^{\mathcal{V}}$
denote the curvature tensor of the fiber (as a semi-Riemann submanifold).
The case $n = 0$ is the Gauss equation of each fiber:

$$\langle R_{UV} W, F\rangle = \langle R^{\mathcal{V}}_{UV} W, F\rangle - \langle T_U W, T_V F\rangle + \langle T_V W, T_U F\rangle, \quad U, V, W, F \in \mathcal{V}(M). \tag{9.26}$$

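We recall the proof (O'Neill 100). We may assume $[U, V] = 0$ so

$$R_{UV} = -\nabla_U\nabla_V + \nabla_V\nabla_U,$$

and, using (9.3) and the definition of $T$, we have

$$\begin{aligned}
\langle \nabla_U\nabla_V W, F\rangle &= \langle \nabla_U\nabla^{\mathcal{V}}_V W, F\rangle + \langle \nabla_U(T_V W), F\rangle \\
&= \langle \nabla^{\mathcal{V}}_U\nabla^{\mathcal{V}}_V W, F\rangle + U\langle T_V W, F\rangle - \langle T_V W, \nabla_U F\rangle \\
&= \langle \nabla^{\mathcal{V}}_U\nabla^{\mathcal{V}}_V W, F\rangle - \langle T_V W, T_U F\rangle,
\end{aligned}$$

where $U\langle T_V W, F\rangle = 0$ because $T_V W$ is horizontal and $F$ is vertical.
Substituting the above expression for $R_{UV}$ into $\langle R_{UV} W, F\rangle$ then proves (9.26).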

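The case $n = 1$ is the Codazzi equation for each fiber: Let $U, V, W$ be
vertical vector fields and $X$ a horizontal vector field. Then

$$\langle R_{UV} W, X\rangle = \langle (\nabla_V T)_U W, X\rangle - \langle (\nabla_U T)_V W, X\rangle. \tag{9.27}$$

This is also in O'Neill, page 115. We recall the proof. We assume that
$[U, V] = 0$ so $R_{UV} = -\nabla_U\nabla_V + \nabla_V\nabla_U$ as before. We have

$$\begin{aligned}
\langle \nabla_U\nabla_V W, X\rangle &= \langle \nabla_U\nabla^{\mathcal{V}}_V W, X\rangle + \langle \nabla_U(T_V W), X\rangle \\
&= \langle T_U(\nabla^{\mathcal{V}}_V W), X\rangle + \langle \nabla_U(T_V W), X\rangle.
\end{aligned}$$

We write

$$\nabla_U(T_V W) = (\nabla_U T)_V W + T_{\nabla_U V} W + T_V\nabla_U W$$

and

$$\langle T_V\nabla_U W, X\rangle = \langle T_V\nabla^{\mathcal{V}}_U W, X\rangle$$

so

$$\langle \nabla_U\nabla_V W, X\rangle = \langle (\nabla_U T)_V W, X\rangle + \langle T_U(\nabla^{\mathcal{V}}_V W), X\rangle + \langle T_V(\nabla^{\mathcal{V}}_U W), X\rangle + \langle T_{\nabla_U V} W, X\rangle.$$

Interchanging $U$ and $V$ and subtracting, using $\nabla_U V = \nabla_V U$, proves (9.27).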

We now turn to the opposite extreme, $n = 4$ and $n = 3$, but first some
notation. We let $R^{\mathcal{H}}$ denote the horizontal lift of the curvature tensor of $B$:
If $h_i \in H(M)_m$ with $v_i := d\pi_m h_i$, define $R^{\mathcal{H}}_{h_1 h_2} h_3$ to be the unique horizontal
vector such that

$$d\pi_m\left(R^{\mathcal{H}}_{h_1 h_2} h_3\right) = R^B_{v_1 v_2} v_3.$$

The case $n = 4$ is given by

$$\langle R_{XY} Z, H\rangle = \langle R^{\mathcal{H}}_{XY} Z, H\rangle - 2\langle A_X Y, A_Z H\rangle + \langle A_Y Z, A_X H\rangle + \langle A_Z X, A_Y H\rangle \tag{9.28}$$

for any four horizontal vector fields $X, Y, Z, H$. As usual, we may assume $X, Y, Z$
are basic and all their brackets are vertical. We will massage each term on the
right of

$$R_{XY} Z = \nabla_{[X,Y]} Z - \nabla_X\nabla_Y Z + \nabla_Y\nabla_X Z.$$

Since $[X, Y]$ is vertical, $[X, Y] = 2A_X Y$. So

$$\nabla_{[X,Y]} Z = 2\mathcal{H}\nabla_{A_X Y} Z + 2T_{A_X Y}(Z).$$

Since $Z$ is basic we can apply (9.12) to the first term giving

$$\nabla_{[X,Y]} Z = 2A_Z(A_X Y) + 2T_{A_X Y}(Z).$$

Let us write

$$\nabla_Y Z = \mathcal{H}\nabla_Y Z + A_Y Z$$

and apply equation (9.1) which we write, by abuse of language, as

$$\mathcal{H}\nabla_Y Z = \nabla^B_Y Z.$$

Then

$$\nabla_X\nabla_Y Z = \nabla^B_X\nabla^B_Y Z + A_X(\nabla^B_Y Z) + A_X A_Y Z + \mathcal{V}\nabla_X(A_Y Z).$$

Separating the horizontal and vertical components in the definition of $R$ gives

$$\mathcal{H}R_{XY} Z = -[\nabla^B_X, \nabla^B_Y]Z + 2A_Z A_X Y - A_X A_Y Z + A_Y A_X Z \tag{9.29}$$

$$\mathcal{V}R_{XY} Z = 2T_{A_X Y}(Z) - \mathcal{V}\nabla_X(A_Y Z) + \mathcal{V}\nabla_Y(A_X Z) - A_X(\nabla^B_Y Z) + A_Y(\nabla^B_X Z). \tag{9.30}$$

As we have chosen $X, Y$ such that $[X_B, Y_B] = 0$, the first term on the right
of (9.29) is just $R^{\mathcal{H}}_{XY} Z$. Taking the scalar product of (9.29) with a horizontal
vector field (and using the fact that $A_E$ is skew adjoint relative to the metric
and $A_X Z = -A_Z X$) proves (9.28).

If we take the scalar product of (9.30) with a vertical vector field $V$ we
get an expression for $\langle R_{XY} Z, V\rangle$ (and we can drop the projections $\mathcal{V}$). Let us
examine what happens when we take the scalar product of the various terms on
the right of (9.30) with $V$. The first term gives

$$\langle T_{A_X Y}(Z), V\rangle = \langle T_V Z, A_X Y\rangle$$

by (9.22). The next two terms give

$$\langle \nabla_Y(A_X Z), V\rangle - \langle \nabla_X(A_Y Z), V\rangle = \langle (\nabla_Y A)_X Z, V\rangle - \langle (\nabla_X A)_Y Z, V\rangle + \langle A_X(\nabla_Y Z), V\rangle - \langle A_Y(\nabla_X Z), V\rangle$$

since $\nabla_X Y - \nabla_Y X = [X, Y]$ is vertical by assumption. The last two terms
cancel the terms obtained by taking the scalar product of the last two terms in
(9.30) with $V$ and we obtain

$$\langle R_{XY} Z, V\rangle = 2\langle A_X Y, T_V Z\rangle + \langle (\nabla_Y A)_X Z, V\rangle - \langle (\nabla_X A)_Y Z, V\rangle. \tag{9.31}$$

We can simplify this a bit using (9.18) and (9.20). Indeed, by (9.18) we can
replace the second term on the right by $-\langle (\nabla_Y A)_Z X, V\rangle$ and then apply (9.20)
to get, for $n = 3$,

$$\langle R_{XY} Z, V\rangle = \langle (\nabla_Z A)_X Y, V\rangle + \langle A_X Y, T_V Z\rangle - \langle A_Y Z, T_V X\rangle - \langle A_Z X, T_V Y\rangle. \tag{9.32}$$

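Finally we give an expression for the case $n = 2$:

$$\langle R_{XV} Y, W\rangle = \langle (\nabla_X T)_V W, Y\rangle + \langle (\nabla_V A)_X Y, W\rangle - \langle T_V X, T_W Y\rangle + \langle A_X V, A_Y W\rangle. \tag{9.33}$$

To prove this, write

$$R_{XV} = \nabla_{\nabla_X V} - \nabla_{\nabla_V X} - \nabla_X\nabla_V + \nabla_V\nabla_X$$

and

$$\begin{aligned}
\langle \nabla_{\nabla_X V} Y, W\rangle &= -\langle Y, T_{\nabla_X V} W\rangle + \langle A_{\nabla_X V} Y, W\rangle \\
-\langle \nabla_{\nabla_V X} Y, W\rangle &= -\langle T_{\nabla_V X} Y, W\rangle - \langle A_{\nabla_V X} Y, W\rangle \\
-\langle \nabla_X\nabla_V Y, W\rangle &= -\langle \nabla_X(T_V Y), W\rangle + \langle \nabla_V Y, A_X W\rangle \\
\langle \nabla_V\nabla_X Y, W\rangle &= \langle \nabla_V A_X Y, W\rangle - \langle \nabla_X Y, T_V W\rangle
\end{aligned}$$

where, for example, in the last equation we have written $\nabla_X Y = A_X Y + \mathcal{H}\nabla_X Y$
and

$$\langle \nabla_V\mathcal{H}\nabla_X Y, W\rangle = -\langle \mathcal{H}\nabla_X Y, \nabla_V W\rangle = -\langle \mathcal{H}\nabla_X Y, T_V W\rangle = -\langle \nabla_X Y, T_V W\rangle.$$

We have

$$\begin{aligned}
\langle (\nabla_X T)_V W, Y\rangle &= -\langle W, (\nabla_X T)_V Y\rangle \\
&= -\langle W, \nabla_X(T_V Y)\rangle + \langle W, T_{\nabla_X V} Y\rangle + \langle W, T_V\nabla_X Y\rangle \\
\langle (\nabla_V A)_X Y, W\rangle &= \langle \nabla_V(A_X Y), W\rangle - \langle A_{\nabla_V X} Y, W\rangle - \langle A_X\nabla_V Y, W\rangle.
\end{aligned}$$

The six terms on the right of the last two equations equal six of the eight terms
on the right of the preceding four, leaving two remaining terms,

$$\langle A_{\nabla_X V} Y, W\rangle - \langle T_{\nabla_V X} Y, W\rangle.$$

But

$$\begin{aligned}
\langle A_{\nabla_X V} Y, W\rangle &= -\langle A_Y\mathcal{H}\nabla_X V, W\rangle \\
&= -\langle A_Y A_X V, W\rangle \\
&= \langle A_X V, A_Y W\rangle
\end{aligned}$$

and a similar argument deals with the second term.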

We repeat our equations. In terms of increasing values of $n$ we have

$$\langle R_{UV}W, F\rangle = \langle R^V_{UV}W, F\rangle - \langle T_UW, T_VF\rangle + \langle T_VW, T_UF\rangle,$$
$$\langle R_{UV}W, X\rangle = \langle(\nabla_VT)_UW, X\rangle - \langle(\nabla_UT)_VW, X\rangle,$$
$$\langle R_{XV}Y, W\rangle = \langle(\nabla_XT)_VW, Y\rangle + \langle(\nabla_VA)_XY, W\rangle - \langle T_VX, T_WY\rangle + \langle A_XV, A_YW\rangle,$$
$$\langle R_{XY}Z, V\rangle = \langle(\nabla_ZA)_XY, V\rangle + \langle A_XY, T_VZ\rangle - \langle A_YZ, T_VX\rangle - \langle A_ZX, T_VY\rangle,$$
$$\langle R_{XY}Z, H\rangle = \langle R^B_{XY}Z, H\rangle - 2\langle A_XY, A_ZH\rangle + \langle A_YZ, A_XH\rangle + \langle A_ZX, A_YH\rangle.$$

We have stated the formula for $n = 2$, i.e. two vertical and two horizontal fields, for the case $\langle R_{XV}Y, W\rangle$, i.e. where one horizontal and one vertical vector occur in the subscript of $R_{E_1E_2}$. But it is easy to check that all other arrangements of two horizontal and two vertical fields can be reduced to this one by curvature identities. Similarly for $n = 1$ and $n = 3$.



9.3.1 Curvature for warped products. 

The curvature formulas simplify considerably in the case of a warped product where $A = 0$ and

$$T_VX = \frac{Xf}{f}V, \qquad T_VW = -\frac{\langle V, W\rangle}{f}\operatorname{grad} f.$$

We will give the formulas where $X, Y, Z, H$ are basic and $U, V, W, F \in \mathcal{L}(F)$. We have $Vf = 0$ and $\langle\nabla_V\operatorname{grad} f, X\rangle = VXf - \langle\operatorname{grad} f, \nabla_VX\rangle = 0$. We conclude that the right hand side of (9.27) vanishes, so $R_{UV}W$ is vertical and we conclude from (9.26) that

$$R_{UV}W = R^F_{UV}W - \frac{\langle\operatorname{grad} f, \operatorname{grad} f\rangle}{f^2}\left(\langle U, W\rangle V - \langle V, W\rangle U\right). \tag{9.34}$$

The Hessian of a function $f$ on a semi-Riemann manifold is defined to be the bilinear form on the tangent space at each point defined by

$$H^f(X, Y) = \langle\nabla_X\operatorname{grad} f, Y\rangle.$$

In fact, we have

$$\langle\nabla_X\operatorname{grad} f, Y\rangle = XYf - \langle\operatorname{grad} f, \nabla_XY\rangle = \nabla_X(df(Y)) - df(\nabla_XY) = [\nabla\nabla f](X, Y)$$

which gives an alternative definition of the Hessian as

$$H^f = \nabla\nabla f$$

and shows that it is indeed a (0,2) type tensor field. Also

$$H^f(X, Y) = XYf - (\nabla_XY)f = [X, Y]f + YXf - [\nabla_XY - \nabla_YX + \nabla_YX]f = YXf - (\nabla_YX)f = H^f(Y, X)$$

showing that it is a symmetric tensor field.
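We have

$$\nabla_XV = \frac{Xf}{f}V = T_VX, \qquad T_VW = -\frac{\langle V, W\rangle}{f}\operatorname{grad} f,$$

if $X$ is basic and $V, W \in \mathcal{L}(F)$. So

$$(\nabla_XT)_VW = \nabla_X(T_VW) - T_{\nabla_XV}W - T_V(\nabla_XW) = -\frac{Xf}{f^2}\langle V, W\rangle\operatorname{grad} f - \frac{\langle V, W\rangle}{f}\nabla_X\operatorname{grad} f + 2\langle V, W\rangle\frac{Xf}{f^2}\operatorname{grad} f$$

and $\langle\operatorname{grad} f, Y\rangle = Yf$. Therefore the case $n = 2$ above yields

$$\langle R_{XV}Y, W\rangle = -\frac{H^f(X, Y)}{f}\langle V, W\rangle. \tag{9.35}$$

The case $n = 3$ gives

$$\langle R_{XY}Z, V\rangle = 0$$

and hence by a symmetry property of the curvature tensor, $\langle R_{XY}Z, V\rangle = \langle R_{ZV}X, Y\rangle = 0$, or, changing notation,

$$\langle R_{XV}Y, Z\rangle = 0.$$

Thus

$$R_{XV}Y = -\frac{H^f(X, Y)}{f}V. \tag{9.36}$$

We have $\langle R_{UV}X, W\rangle = -\langle R_{UV}W, X\rangle = 0$ and by (9.36) and the first Bianchi identity $\langle R_{UV}X, Y\rangle = (H^f(X, Y)/f)\times(\langle U, V\rangle - \langle V, U\rangle) = 0$ so

$$R_{UV}X = 0. \tag{9.37}$$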




If we use this fact, the symmetry of the curvature tensor and (9.35) we see that

$$R_{XV}W = \frac{\langle V, W\rangle}{f}\nabla_X\operatorname{grad} f. \tag{9.38}$$

It follows from the case $n = 3$ and $n = 4$ that

$$R_{XY}Z = R^B_{XY}Z, \tag{9.39}$$

the basic vector field corresponding to the vector field $R^B_{X_BY_B}Z_B$. Hence $\langle R_{XY}V, Z\rangle = 0$. We also have $\langle R_{XY}V, W\rangle = \langle R_{VW}X, Y\rangle = 0$ so

$$R_{XY}V = 0. \tag{9.40}$$



Ricci curvature of a warped product. 

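Recall that the Ricci curvature, $\operatorname{Ric}(X, Y)$, defined as the trace of the map $V \mapsto R_{XV}Y$, is given in terms of an "orthonormal" frame field $E_1, \ldots, E_n$ by

$$\operatorname{Ric}(X, Y) = \sum_i \epsilon_i\langle R_{XE_i}Y, E_i\rangle, \qquad \epsilon_i = \langle E_i, E_i\rangle.$$

We will apply this to a frame field whose first $\dim B$ vectors lie in Vect $B$ and whose last $d = \dim F$ vectors lie in Vect $F$. We will assume that $d > 1$ and that $X, Y \in$ Vect $B$ and $V, W \in$ Vect $F$. We get

$$\operatorname{Ric}(X, Y) = \operatorname{Ric}_B(X, Y) - \frac{d}{f}\operatorname{Hess}_B(f)(X, Y) \tag{9.41}$$
$$\operatorname{Ric}(X, V) = 0 \tag{9.42}$$
$$\operatorname{Ric}(V, W) = \operatorname{Ric}_F(V, W) - \langle V, W\rangle f^\# \quad\text{where} \tag{9.43}$$
$$f^\# = \frac{\Delta f}{f} + (d - 1)\frac{\langle\operatorname{grad} f, \operatorname{grad} f\rangle}{f^2} \tag{9.44}$$

where $\Delta f$ is the Laplacian of $f$, which is the same as the contraction of the Hessian of $f$.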



Geodesics for a warped product

We now compute the equations for a geodesic on $B\times_fF$. Let $\gamma(s) = (\alpha(s), \beta(s))$ be a curve on $B\times_fF$ and suppose temporarily that neither $\alpha'(s) = 0$ nor $\beta'(s) = 0$ in an interval we are studying. So we can embed the tangent vectors along both projected curves in vector fields, $X$ on $B$ and $V$ on $F$, so that $\gamma$ is a solution curve to $X + V$ on $B\times_fF$. The condition that $\gamma$ be a geodesic is then that $\nabla_{X+V}(X + V) = 0$ along $\gamma$. But

$$\nabla_{X+V}(X + V) = \nabla_XX + \nabla_XV + \nabla_VX + \nabla_VV = \nabla^B_XX + 2\frac{Xf}{f}V - \frac{\langle V, V\rangle}{f}\operatorname{grad} f + \nabla^F_VV.$$

Separating the vertical and horizontal components, and using the fact that $\nabla^B_XX = \alpha''$ along $\alpha$ and $\beta' = V$, $\nabla^F_VV = \beta''$ along $\beta$, shows that the geodesic equations take the form

$$\alpha'' = \langle\beta', \beta'\rangle_F\,(f\circ\alpha)\operatorname{grad} f \quad\text{on } B \tag{9.45}$$
$$\beta'' = -\frac{2}{f\circ\alpha}\frac{d(f\circ\alpha)}{ds}\beta' \quad\text{on } F. \tag{9.46}$$

A limiting argument [O-208] shows that these equations hold for all geodesics.

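We repeat all the important equations of this subsection:

$$\nabla_XY = \nabla^B_XY$$
$$\nabla_XV = \frac{Xf}{f}V = \nabla_VX$$
$$\mathcal{H}\nabla_VW = T_VW = -\frac{\langle V, W\rangle}{f}\operatorname{grad} f$$
$$\mathcal{V}\nabla_VW = \nabla^F_VW$$

geodesic eqns

$$\alpha'' = \langle\beta', \beta'\rangle_F\,(f\circ\alpha)\operatorname{grad} f \quad\text{on } B$$
$$\beta'' = -\frac{2}{f\circ\alpha}\frac{d(f\circ\alpha)}{ds}\beta' \quad\text{on } F$$

curvature

$$R_{XY}Z = R^B_{XY}Z$$
$$R_{VX}Y = \frac{\operatorname{Hess}_B(f)(X, Y)}{f}V$$
$$R_{XV}W = \frac{\langle V, W\rangle}{f}\nabla_X\operatorname{grad} f$$
$$R_{UV}W = R^F_{UV}W - \frac{\langle\operatorname{grad} f, \operatorname{grad} f\rangle}{f^2}\left(\langle U, W\rangle V - \langle V, W\rangle U\right)$$

Ricci curv

$$\operatorname{Ric}(X, Y) = \operatorname{Ric}_B(X, Y) - \frac{d}{f}\operatorname{Hess}_B(f)(X, Y)$$
$$\operatorname{Ric}(X, V) = 0$$
$$\operatorname{Ric}(V, W) = \operatorname{Ric}_F(V, W) - \langle V, W\rangle f^\# \quad\text{where}$$
$$f^\# := \frac{\Delta f}{f} + (d - 1)\frac{\langle\operatorname{grad} f, \operatorname{grad} f\rangle}{f^2}$$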

9.3.2 Sectional curvature. 

We return to the general case of a submersion, and recall that the sectional curvature of the plane $P_{ab} \subset TM_m$, spanned by two independent vectors $a, b \in TM_m$, is defined as

$$K(P_{ab}) = \frac{\langle R_{ab}a, b\rangle}{\langle a, a\rangle\langle b, b\rangle - \langle a, b\rangle^2}.$$

We can write the denominator more simply as $\|a\wedge b\|^2$.

In the following formulas, all pairs of vectors are assumed to be independent, with $v, w$ vertical and $x, y$ horizontal, and where $x_B$ denotes $d\pi_m(x)$ and $y_B := d\pi_m(y)$. Substituting into our formulas for the curvature gives

$$K(P_{vw}) = K_V(P_{vw}) - \frac{\langle T_vv, T_ww\rangle - \|T_vw\|^2}{\|v\wedge w\|^2} \tag{9.47}$$
$$K(P_{xv}) = \frac{\langle(\nabla_xT)_vv, x\rangle + \|A_xv\|^2 - \|T_vx\|^2}{\|x\|^2\|v\|^2} \tag{9.48}$$
$$K(P_{xy}) = K_B(P_{x_By_B}) - \frac{3\|A_xy\|^2}{\|x\wedge y\|^2}. \tag{9.49}$$



9.4 Reductive homogeneous spaces. 
9.4.1 Bi-invariant metrics on a Lie group. 

Let $G$ be a Lie group with Lie algebra $\mathfrak{g}$, which we identify with the left invariant vector fields on $G$. Any non-degenerate scalar product $\langle\ ,\ \rangle$ on $\mathfrak{g}$ thus determines (and is equivalent to) a left invariant semi-Riemann metric on $G$. We let $A_a$ denote conjugation by the element $a \in G$, so

$$A_a : G \to G, \qquad A_a(b) = aba^{-1}.$$

We have $A_a(e) = e$ and

$$d(A_a)_e = \operatorname{Ad}_a : TG_e \to TG_e.$$

Since $A_a = L_a\circ R_{a^{-1}}$, the left invariant metric $\langle\ ,\ \rangle$ is right invariant if and only if it is $A_a$ invariant for all $a \in G$, which is the same as saying that $\langle\ ,\ \rangle$ is invariant under the adjoint representation of $G$ on $\mathfrak{g}$, i.e. that

$$\langle\operatorname{Ad}_aY, \operatorname{Ad}_aZ\rangle = \langle Y, Z\rangle, \qquad \forall\, Y, Z \in \mathfrak{g},\ a \in G.$$

Setting $a = \exp tX$, $X \in \mathfrak{g}$, differentiating with respect to $t$ and setting $t = 0$ gives

$$\langle[X, Y], Z\rangle + \langle Y, [X, Z]\rangle = 0, \qquad \forall\, X, Y, Z \in \mathfrak{g}. \tag{9.50}$$

If $G$ is connected, this condition implies that $\langle\ ,\ \rangle$ is invariant under Ad and hence is invariant under right and left multiplication. Such a metric is called bi-invariant.

Let inv denote the map sending every element into its inverse:

$$\operatorname{inv} : a \mapsto a^{-1}, \qquad a \in G.$$

Since $\operatorname{inv}\exp tX = \exp(-tX)$ we see that

$$d\operatorname{inv}_e = -\mathrm{id}.$$

Also

$$\operatorname{inv} = R_{a^{-1}}\circ\operatorname{inv}\circ L_{a^{-1}}$$

since the right hand side sends $b \in G$ into

$$b \mapsto a^{-1}b \mapsto b^{-1}a \mapsto b^{-1}.$$

Hence $d\operatorname{inv}_a : TG_a \to TG_{a^{-1}}$ is given, by the chain rule, as

$$dR_{a^{-1}}\circ d\operatorname{inv}_e\circ dL_{a^{-1}} = -dR_{a^{-1}}\circ dL_{a^{-1}}$$

implying that a bi-invariant metric is invariant under the map inv. Conversely, if a left invariant metric is invariant under inv then it is also right invariant, hence bi-invariant, since

$$R_a = \operatorname{inv}\circ L_{a^{-1}}\circ\operatorname{inv}.$$

The Koszul formula simplifies considerably when applied to left invariant vector fields and bi-invariant metrics since all scalar products are constant, so their derivatives vanish, and we are left with

$$2\langle\nabla_XY, Z\rangle = -\langle X, [Y, Z]\rangle - \langle Y, [X, Z]\rangle + \langle Z, [X, Y]\rangle$$

and the first two terms cancel by (9.50). We are left with

$$\nabla_XY = \tfrac{1}{2}[X, Y]. \tag{9.51}$$

Conversely, if $\langle\ ,\ \rangle$ is a left invariant metric for which (9.51) holds, then

$$\langle X, [Y, Z]\rangle = 2\langle X, \nabla_YZ\rangle = -2\langle\nabla_YX, Z\rangle = -\langle[Y, X], Z\rangle = \langle[X, Y], Z\rangle$$

so the metric is bi-invariant.

Let $\alpha$ be an integral curve of the left invariant vector field $X$. Condition (9.51) implies that $\alpha'' = \nabla_XX = 0$ so $\alpha$ is a geodesic. Thus the one-parameter groups are the geodesics through the identity, and all geodesics are left cosets of one parameter groups. (This is the reason for the name exponential map in Riemannian geometry.)

We compute the curvature of a bi-invariant metric by applying the definition to left invariant vector fields:

$$R_{XY}Z = \tfrac{1}{2}[[X, Y], Z] - \tfrac{1}{4}[X, [Y, Z]] + \tfrac{1}{4}[Y, [X, Z]].$$

Jacobi's identity implies the last two terms add up to $-\tfrac{1}{4}[[X, Y], Z]$ and so

$$R_{XY}Z = \tfrac{1}{4}[[X, Y], Z]. \tag{9.52}$$

In particular

$$\langle R_{XY}X, Y\rangle = \tfrac{1}{4}\langle[[X, Y], X], Y\rangle = \tfrac{1}{4}\langle[X, Y], [X, Y]\rangle$$

so

$$K(X, Y) = \frac{1}{4}\frac{\|[X, Y]\|^2}{\|X\wedge Y\|^2}. \tag{9.53}$$
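As a sanity check on (9.50) and (9.53), here is a minimal sketch in the simplest non-abelian case. It assumes the familiar model of so(3) as R^3 with the cross product as bracket and the dot product as invariant scalar product; the function names are illustrative and not part of the text.

```python
from fractions import Fraction

def bracket(x, y):
    """Lie bracket on so(3), identified with R^3 via the cross product."""
    return (x[1] * y[2] - x[2] * y[1],
            x[2] * y[0] - x[0] * y[2],
            x[0] * y[1] - x[1] * y[0])

def dot(x, y):
    """Invariant scalar product (the Euclidean dot product in this model)."""
    return sum(a * b for a, b in zip(x, y))

def invariance_defect(x, y, z):
    """Left-hand side of (9.50); it vanishes for a bi-invariant metric."""
    return dot(bracket(x, y), z) + dot(y, bracket(x, z))

def sectional_curvature(x, y):
    """K(X, Y) = (1/4) |[X, Y]|^2 / |X ^ Y|^2, as in (9.53)."""
    b = bracket(x, y)
    area_squared = dot(x, x) * dot(y, y) - dot(x, y) ** 2
    return Fraction(dot(b, b), 4 * area_squared)

X, Y, Z = (1, 2, 3), (-2, 0, 5), (4, 1, -1)
print(invariance_defect(X, Y, Z))
print(sectional_curvature(X, Y))
```

In this model the curvature comes out as 1/4 for every plane, because the squared length of a cross product equals the squared area of the parallelogram spanned by its factors.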

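For each $X \in \mathfrak{g}$ the linear transformation of $\mathfrak{g}$ consisting of bracketing on the left by $X$ is called $\operatorname{ad} X$. So

$$\operatorname{ad} X : \mathfrak{g} \to \mathfrak{g}, \qquad \operatorname{ad} X(V) := [X, V].$$

We can thus write our formula for the curvature as

$$R_{XV}Y = -\tfrac{1}{4}(\operatorname{ad} Y)(\operatorname{ad} X)V.$$

Now the Ricci curvature was defined as

$$\operatorname{Ric}(X, Y) = \operatorname{tr}[V \mapsto R_{XV}Y].$$

We thus see that for any bi-invariant metric, the Ricci curvature is always given by

$$\operatorname{Ric} = -\tfrac{1}{4}B \tag{9.54}$$

where $B$, the Killing form, is defined by

$$B(X, Y) := \operatorname{tr}\,(\operatorname{ad} X)(\operatorname{ad} Y). \tag{9.55}$$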

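The Killing form is symmetric, since $\operatorname{tr}(AB) = \operatorname{tr}(BA)$ for any pair of linear operators. It is also invariant. Indeed, let $\mu : \mathfrak{g} \to \mathfrak{g}$ be any automorphism of $\mathfrak{g}$, so $\mu([X, Y]) = [\mu(X), \mu(Y)]$ for all $X, Y \in \mathfrak{g}$. We can read this equation as saying

$$\operatorname{ad}(\mu(X))(\mu(Y)) = \mu(\operatorname{ad}(X)(Y))$$

or

$$\operatorname{ad}(\mu(X)) = \mu\circ\operatorname{ad} X\circ\mu^{-1}.$$

Hence

$$\operatorname{ad}(\mu(X))\operatorname{ad}(\mu(Y)) = \mu\circ\operatorname{ad} X\operatorname{ad} Y\circ\mu^{-1}.$$

Since trace is invariant under conjugation, it follows that

$$B(\mu(X), \mu(Y)) = B(X, Y).$$

Applying this to $\mu = \exp(t\operatorname{ad} Z)$ and differentiating at $t = 0$ shows that

$$B([Z, X], Y) + B(X, [Z, Y]) = 0.$$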

So the Killing form defines a bi-invariant scalar product on $G$. Of course it need not, in general, be non-degenerate. For example, if the group is commutative, it vanishes identically. A group $G$ is called semi-simple if its Killing form is non-degenerate. So on a semi-simple Lie group, we can always choose the Killing form as the bi-invariant metric. For such a choice, our formula above for the Ricci curvature then shows that the group manifold with this metric is Einstein, i.e. the Ricci curvature is a multiple of the scalar product.

Suppose that the adjoint representation of G on g is irreducible. Then g can 
not have two invariant non-degenerate scalar products unless one is a multiple 
of the other. In this case, we can also conclude from our formula that the group 
manifold is Einstein. 
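The Killing form and the Einstein property can be checked by hand in a small case. The sketch below again assumes the model of so(3) as R^3 with the cross product (an illustrative choice, not taken from the text) and computes B(X, Y) = tr(ad X ad Y) directly from the matrices of ad.

```python
from fractions import Fraction

def bracket(x, y):
    """Lie bracket on so(3) ~ R^3, realised as the cross product."""
    return (x[1] * y[2] - x[2] * y[1],
            x[2] * y[0] - x[0] * y[2],
            x[0] * y[1] - x[1] * y[0])

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def ad(x):
    """Matrix of ad X, i.e. of the map V -> [X, V], in the standard basis."""
    return [[0, -x[2], x[1]],
            [x[2], 0, -x[0]],
            [-x[1], x[0], 0]]

def killing(x, y):
    """B(X, Y) = tr(ad X ad Y), as in (9.55)."""
    a, b = ad(x), ad(y)
    return sum(a[i][k] * b[k][i] for i in range(3) for k in range(3))

def ricci(x, y):
    """Ric = -(1/4) B, as in (9.54)."""
    return Fraction(-1, 4) * killing(x, y)

X, Y, Z = (1, 2, 3), (-2, 0, 5), (4, 1, -1)
print(killing(X, Y), ricci(X, Y))
```

Here B is -2 times the dot product, so it is non-degenerate, and Ric is 1/2 times the dot product: the group with this metric is Einstein, in line with the paragraph above.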



9.4.2 Homogeneous spaces. 

Now suppose that $B = G/H$ where $H$ is a subgroup with Lie algebra $\mathfrak{h}$ such that $\mathfrak{h}$ has an $H$ invariant complementary subspace $\mathfrak{m} \subset \mathfrak{g}$. In fact, for simplicity, let us assume that $\mathfrak{g}$ has a non-degenerate bi-invariant scalar product, whose restriction to $\mathfrak{h}$ is non-degenerate, and let $\mathfrak{m} := \mathfrak{h}^\perp$. This defines a $G$ invariant metric on $B$, and the projection $G \to G/H = B$ is a submersion. The left invariant horizontal vector fields are exactly the vector fields $X \in \mathfrak{m}$, and so

$$A_XY = \tfrac{1}{2}\mathcal{V}[X, Y], \qquad X, Y \in \mathfrak{m}.$$

On the other hand, the fibers are cosets of $H$, hence totally geodesic since the geodesics are one parameter subgroups. Hence $T = 0$. We can read (9.49) backwards to determine $K_B(P_B)$ as

$$K_B(P_B) = K(P) + \frac{3}{4}\frac{\|\mathcal{V}[X, Y]\|^2}{\|X\wedge Y\|^2}$$

or

$$K_B(P_{X_BY_B}) = \frac{\tfrac{1}{4}\|\mathcal{H}[X, Y]\|^2 + \|\mathcal{V}[X, Y]\|^2}{\|X\wedge Y\|^2}, \qquad X, Y \in \mathfrak{m}. \tag{9.56}$$

See O'Neill pp. 313-15 for a slightly more general formulation of this result.

It follows from (9.2) that the geodesics emanating from the point $H \in B = G/H$ are just the curves $(\exp tX)H$, $X \in \mathfrak{m}$.



9.4.3 Normal symmetric spaces. 

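Formula (9.56) simplifies if all brackets of basic vector fields are vertical. So we assume that $[\mathfrak{m}, \mathfrak{m}] \subset \mathfrak{h}$. Then we get

$$K_B(P_{X_BY_B}) = \frac{\|[X, Y]\|^2}{\|X\wedge Y\|^2} = \frac{\langle[[X, Y], X], Y\rangle}{\|X\wedge Y\|^2}. \tag{9.57}$$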
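To see (9.56) at work, here is a small sketch in an assumed model: g = so(3), realised as R^3 with the cross product and the dot product, with h spanned by the third basis vector and m spanned by the first two. The vertical projection is then the third component and the horizontal projection is the first two components. All names are illustrative.

```python
from fractions import Fraction

def bracket(x, y):
    """Lie bracket on so(3) ~ R^3, realised as the cross product."""
    return (x[1] * y[2] - x[2] * y[1],
            x[2] * y[0] - x[0] * y[2],
            x[0] * y[1] - x[1] * y[0])

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def vertical(z):
    """Projection onto h = span(e3)."""
    return (0, 0, z[2])

def horizontal(z):
    """Projection onto m = span(e1, e2)."""
    return (z[0], z[1], 0)

def area_squared(x, y):
    return dot(x, x) * dot(y, y) - dot(x, y) ** 2

def group_curvature(x, y):
    """Curvature of the group itself, K = (1/4)|[X,Y]|^2 / |X ^ Y|^2."""
    b = bracket(x, y)
    return Fraction(dot(b, b), 4 * area_squared(x, y))

def base_curvature(x, y):
    """Right-hand side of (9.56) for X, Y in m."""
    b = bracket(x, y)
    hb, vb = horizontal(b), vertical(b)
    return (Fraction(1, 4) * dot(hb, hb) + dot(vb, vb)) / area_squared(x, y)

X, Y = (1, 2, 0), (3, -1, 0)
print(group_curvature(X, Y), base_curvature(X, Y))
```

In this example the bracket of two elements of m is purely vertical, so the quotient (a two-sphere) has curvature 1 while the group has curvature 1/4.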



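For examples where this holds, we need to search for a Lie group $G$ whose Lie algebra $\mathfrak{g}$ has an Ad-invariant non-degenerate scalar product $\langle\ ,\ \rangle$ and a decomposition $\mathfrak{g} = \mathfrak{h} + \mathfrak{m}$ such that

$$\mathfrak{h} \perp \mathfrak{m}, \qquad [\mathfrak{h}, \mathfrak{h}] \subset \mathfrak{h}, \qquad [\mathfrak{h}, \mathfrak{m}] \subset \mathfrak{m}, \qquad [\mathfrak{m}, \mathfrak{m}] \subset \mathfrak{h}.$$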



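Let $\theta : \mathfrak{g} \to \mathfrak{g}$ be the linear map determined by

$$\theta X = -X,\ X \in \mathfrak{m}, \qquad \theta U = U,\ U \in \mathfrak{h}.$$

Then

• $\theta$ is an isometry of $\langle\ ,\ \rangle$

• $\theta[E, F] = [\theta E, \theta F]$ for all $E, F \in \mathfrak{g}$

• $\theta^2 = \mathrm{id}$.

Conversely, suppose we start with a $\theta$ satisfying these conditions. Since $\theta^2 = \mathrm{id}$, we can write $\mathfrak{g}$ as the linear direct sum of the $+1$ and $-1$ eigenspaces of $\theta$, i.e. define $\mathfrak{h} := \{U \mid \theta(U) = U\}$ and $\mathfrak{m} := \{X \mid \theta(X) = -X\}$. Since $\theta$ preserves the scalar product, eigenspaces corresponding to different eigenvalues must be orthogonal, and the bracket conditions on $\mathfrak{h}$ and $\mathfrak{m}$ follow automatically from their definition.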

One way of finding such a $\theta$ is to find a diffeomorphism $\sigma : G \to G$ such that

• $G$ has a bi-invariant metric which is also preserved by $\sigma$,

• $\sigma$ is an automorphism of $G$, i.e. $\sigma(ab) = \sigma(a)\sigma(b)$,

• $\sigma^2 = \mathrm{id}$.

If we have such a $\sigma$, then $\theta := d\sigma_e$ satisfies our requirements. Furthermore, the set of fixed points of $\sigma$,

$$F := \{a \in G \mid \sigma(a) = a\}$$

is clearly a subgroup, which we could take as our subgroup $H$. In fact, let $F_0$ denote the connected component of the identity in $F$, and let $H$ be any subgroup satisfying $F_0 \subset H \subset F$. Then $M = G/H$ satisfies all our requirements. Such a space is called a normal symmetric space. We construct a large collection of examples of such spaces in the next two subsections.

9.4.4 Orthogonal groups. 

We begin by constructing an explicit model for the spaces $\mathbb{R}^{p,q}$ and the orthogonal groups $O(p, q)$. We let $\cdot$ denote the standard Euclidean (positive definite) scalar product on $\mathbb{R}^n$. For any matrix $M$, square or rectangular, we let ${}^tM$ denote its transpose. For a given choice of $(p, q)$ with $p + q = n$ we let $\epsilon$ denote the diagonal matrix with $+1$ in the first $p$ positions and $-1$ in the last $q$ positions. Then

$$\langle u, v\rangle := (\epsilon u)\cdot v = u\cdot(\epsilon v)$$

is a scalar product on $\mathbb{R}^n$ of type $(p, q)$.

The condition that a matrix $A$ belong to $O(p, q)$ is then that

$$\epsilon Av\cdot Aw = \epsilon v\cdot w, \qquad \forall\, v, w \in \mathbb{R}^n$$

which is the same as

$$({}^tA\epsilon Av)\cdot w = (\epsilon v)\cdot w \qquad \forall\, v, w \in \mathbb{R}^n$$

which is the same as the condition

$${}^tA\epsilon Av = \epsilon v \qquad \forall\, v \in \mathbb{R}^n.$$

So ${}^tA\epsilon A = \epsilon$ or

$$A \in O(p, q) \iff {}^tA = \epsilon A^{-1}\epsilon. \tag{9.58}$$

Now suppose that $A = \exp sM$, $M \in \mathfrak{g} := o(p, q)$. Then, since the exponential of the transpose of a matrix is the transpose of its exponential, we have

$$\exp s\,{}^tM = \epsilon\exp(-sM)\epsilon = \exp(-s\epsilon M\epsilon)$$

since $\epsilon^{-1} = \epsilon$. Differentiation at $s = 0$ gives

$${}^tM = -\epsilon M\epsilon \tag{9.59}$$

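as the condition for a matrix to belong to the Lie algebra $o(p, q)$. If we write $M$ in "block" form

$$M = \begin{pmatrix} a & x \\ y & b \end{pmatrix}$$

then

$${}^tM = \begin{pmatrix} {}^ta & {}^ty \\ {}^tx & {}^tb \end{pmatrix}, \qquad \epsilon M\epsilon = \begin{pmatrix} a & -x \\ -y & b \end{pmatrix}$$

and the condition to belong to $o(p, q)$ is that

$${}^ta = -a, \qquad {}^tb = -b, \qquad y = {}^tx$$

so the most general matrix in $o(p, q)$ has the form

$$M = \begin{pmatrix} a & x \\ {}^tx & b \end{pmatrix}, \qquad {}^ta = -a,\ {}^tb = -b. \tag{9.60}$$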
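The group condition (9.58) and the Lie algebra condition (9.59) are easy to test numerically. The following is a minimal sketch using plain nested lists; the helper names and the sample matrices (a hyperbolic rotation in O(1,1) and an element of o(1,2) in the block form of (9.60)) are illustrative choices, not taken from the text.

```python
import math

def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def epsilon(p, q):
    """Diagonal matrix with +1 in the first p and -1 in the last q positions."""
    n = p + q
    return [[(1 if i < p else -1) if i == j else 0 for j in range(n)]
            for i in range(n)]

def max_abs_difference(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def in_group(A, p, q, tol=1e-12):
    """Check tA eps A = eps, the condition preceding (9.58)."""
    eps = epsilon(p, q)
    return max_abs_difference(matmul(matmul(transpose(A), eps), A), eps) < tol

def in_algebra(M, p, q, tol=1e-12):
    """Check tM = -eps M eps, i.e. condition (9.59)."""
    eps = epsilon(p, q)
    rhs = [[-v for v in row] for row in matmul(matmul(eps, M), eps)]
    return max_abs_difference(transpose(M), rhs) < tol

s = 0.7
boost = [[math.cosh(s), math.sinh(s)], [math.sinh(s), math.cosh(s)]]
rotation = [[math.cos(s), -math.sin(s)], [math.sin(s), math.cos(s)]]
M = [[0, 2, 5], [2, 0, -3], [5, 3, 0]]   # block form (9.60) with p = 1, q = 2

print(in_group(boost, 1, 1), in_group(rotation, 1, 1), in_algebra(M, 1, 2))
```

The hyperbolic rotation passes the O(1,1) test, the ordinary rotation does not, and the matrix M, whose off-diagonal blocks are transposes of each other and whose diagonal blocks are antisymmetric, passes the Lie algebra test.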

Consider the symmetric bilinear form $X, Y \mapsto \operatorname{tr} XY$, called the "trace form". It is clearly invariant under conjugation, hence, restricted to $X, Y$ both belonging to $o(p, q)$, it is an invariant bilinear form. Let us show that it is non-degenerate. Indeed, suppose that

$$X = \begin{pmatrix} a & x \\ {}^tx & b \end{pmatrix}, \qquad Y = \begin{pmatrix} c & y \\ {}^ty & d \end{pmatrix}$$

are elements of $o(p, q)$. Then

$$\operatorname{tr} XY = \operatorname{tr}(ac + bd + 2x\,{}^ty).$$

This shows that the subalgebra $\mathfrak{h} := o(p)\oplus o(q)$ consisting of all "block diagonal" matrices is orthogonal to the subspace $\mathfrak{m}$ consisting of all matrices with zero entries on the diagonal blocks, i.e. of the form

$$\begin{pmatrix} 0 & x \\ {}^tx & 0 \end{pmatrix}.$$

For matrices of the latter form, we have

$$\operatorname{tr} x\,{}^tx = \sum_{ij} x_{ij}^2$$

and so the restriction of the trace form to $\mathfrak{m}$ is positive definite. On the other hand, since ${}^ta = -a$ and ${}^tb = -b$ we have $\operatorname{tr} a^2 = -\sum_{ij} a_{ij}^2$, which is negative definite, and similarly for $b$. Hence the restriction of the trace form to $\mathfrak{h}$ is negative definite.



9.4.5 Dual Grassmannians. 

Suppose we consider the space $\mathbb{R}^{p+q}$, the positive definite Euclidean space, with orthogonal group $O(p + q)$. Its Lie algebra consists of all anti-symmetric matrices of size $p + q$ and the restriction of the trace form to $o(p + q)$ is negative definite. So we can choose a positive definite invariant scalar product on $\mathfrak{g} = o(p + q)$ by setting

$$\langle X, Y\rangle := -\tfrac{1}{2}\operatorname{tr} XY.$$

Let $\epsilon$ be as in the preceding subsection, so $\epsilon$ is diagonal with $p$ entries $+1$ and $q$ entries $-1$ on the diagonal. Notice that $\epsilon$ is itself an orthogonal transformation (for the positive definite scalar product on $\mathbb{R}^{p+q}$), and hence conjugation by $\epsilon$ is an automorphism of $O(p + q)$ and also of $SO(p + q)$, the subgroup consisting of orthogonal matrices with determinant one.

Let us take $G = SO(p + q)$ and $\sigma$ to be conjugation by $\epsilon$. So

$$\sigma\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} a & -b \\ -c & d \end{pmatrix}$$

and hence the fixed point subgroup is $F = S(O(p)\times O(q))$. We will take $H = SO(p)\times SO(q)$. The subspace $\mathfrak{m}$ consists of all matrices of the form

$$X = \begin{pmatrix} 0 & -{}^tx \\ x & 0 \end{pmatrix}$$

and

$$\operatorname{tr} X^2 = -2\operatorname{tr}\,{}^tx\,x$$

so

$$\langle X, X\rangle = \operatorname{tr}\,{}^tx\,x,$$

which was the reason for the $\tfrac{1}{2}$ in our definition of $\langle\ ,\ \rangle$.

Our formula for the sectional curvature of a normal symmetric space shows that the sectional curvature of

$$G_{p,q}$$

is non-negative. In the special case $p = 1$ the quotient space is the $q$-dimensional sphere,

$$G_{1,q} = S^q$$

and the $x$ occurring in the above formula is a column vector. Hence $[X, Y]$, where $Y$ corresponds to the column vector $y$, is the operator $y\otimes{}^tx - x\otimes{}^ty \in o(q)$, and

$$\|[X, Y]\|^2 = \|X\wedge Y\|^2,$$

proving that the unit sphere has constant curvature $+1$.
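The identity behind the constant curvature of the sphere can be confirmed by direct matrix arithmetic. The sketch below assumes p = 1, embeds a column vector x as the block matrix with -x transposed in the first row and x in the first column, and uses the scalar product -1/2 tr XY; the helper names are illustrative.

```python
from fractions import Fraction

def embed(x):
    """Element of m in o(1 + q) attached to the column vector x."""
    n = len(x) + 1
    X = [[0] * n for _ in range(n)]
    for i, xi in enumerate(x):
        X[0][i + 1] = -xi
        X[i + 1][0] = xi
    return X

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(ab, ba)]

def inner(a, b):
    """<X, Y> = -(1/2) tr XY."""
    prod = matmul(a, b)
    return Fraction(-1, 2) * sum(prod[i][i] for i in range(len(prod)))

def sphere_curvature(x, y):
    """|[X, Y]|^2 / |X ^ Y|^2 for X, Y in m, as in (9.57)."""
    X, Y = embed(x), embed(y)
    B = commutator(X, Y)
    area_squared = inner(X, X) * inner(Y, Y) - inner(X, Y) ** 2
    return inner(B, B) / area_squared

print(sphere_curvature((1, 2, 3), (0, 1, -1)))
```

For any pair of independent vectors the ratio is exactly 1, which is the statement that the quotient is the unit sphere.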

Next let $G$ be the connected component of $O(p, q)$, as described in the preceding subsection, and again take $\sigma$ to be conjugation by $\epsilon$. This time take

$$\langle X, Y\rangle = \tfrac{1}{2}\operatorname{tr} XY.$$

The $-1$ eigenspace $\mathfrak{m}$ of $\sigma$ consists of all matrices of the form

$$\begin{pmatrix} 0 & {}^tx \\ x & 0 \end{pmatrix}$$

and the restriction of $\langle\ ,\ \rangle$ to $\mathfrak{m}$ is positive definite, while the restriction to the Lie algebra of $H := SO(p)\times SO(q)$ is negative definite. The corresponding symmetric space $G/H$ is denoted by $G^*_{p,q}$. It has negative sectional curvature. In particular, the case $p = 1$ is hyperbolic space, and the same computation as above shows that it has constant sectional curvature equal to $-1$. This realizes hyperbolic space as the space of timelike lines through the origin in a Lorentz space of one higher dimension.

These two classes of symmetric spaces are dual in the following sense: Suppose that $(\mathfrak{h}, \mathfrak{m})$ and $(\mathfrak{h}^*, \mathfrak{m}^*)$ are the Lie algebra data of symmetric spaces $G/H$ and $G^*/H^*$. Suppose we have

• a Lie algebra isomorphism $\ell : \mathfrak{h} \to \mathfrak{h}^*$ such that $\langle\ell U, \ell V\rangle^* = -\langle U, V\rangle$ for all $U, V \in \mathfrak{h}$, and

• a linear isometry $i : \mathfrak{m} \to \mathfrak{m}^*$ which reverses the bracket:

$$[iX, iY]^* = -[X, Y] \qquad \forall\, X, Y \in \mathfrak{m}.$$

Then it is immediate from our formula for the sectional curvature that

$$K^*(iX, iY) = -K(X, Y)$$

for any $X, Y \in \mathfrak{m}$ spanning a non-degenerate plane. We say that the symmetric spaces $G/H$ and $G^*/H^*$ are in duality.

In our case, $H = SO(p)\times SO(q)$ for both $G_{p,q}$ and $G^*_{p,q}$ so we take $\ell = \mathrm{id}$. We define $i$ by

$$i : \begin{pmatrix} 0 & -{}^tx \\ x & 0 \end{pmatrix} \mapsto \begin{pmatrix} 0 & {}^tx \\ x & 0 \end{pmatrix}.$$

It is easy to check that these satisfy our axioms and so $G_{p,q}$ and $G^*_{p,q}$ are dual. For example, the sphere and hyperbolic space are dual in this sense.

9.5 Schwarzschild as a warped product. 

In the Schwarzschild model we define

$$P := \{(t, r) \mid r > 0,\ r \neq 2M\}$$

with metric

$$-h\,dt^2 + \frac{1}{h}dr^2, \qquad h = h(r) = 1 - \frac{2M}{r}.$$

Then construct the warped product

$$P\times_rS^2$$

where $S^2$ is the ordinary unit sphere with its standard positive definite metric, call it $d\sigma^2$. So, following O'Neill's conventions, the total metric is of type (3, 1) (timelike = negative square length) given by

$$-h\,dt^2 + \frac{1}{h}dr^2 + r^2d\sigma^2.$$

We write $P = P_I\cup P_{II}$ where

$$P_I = \{(t, r) \mid r > 2M\}, \qquad P_{II} = \{(t, r) \mid r < 2M\}$$

and

$$N = P_I\times_rS^2, \qquad B = P_{II}\times_rS^2.$$

$N$ is called the Schwarzschild exterior and $B$ is called the black hole. In the exterior, $\partial_t$ is timelike. In $B$, $\partial_t$ is spacelike and $\partial_r$ is timelike.

In either, the vector fields $\partial_t, \partial_r$ are orthogonal and basic. So the base is a surface with orthogonal coordinates. To apply the formulas for warped products we need some preliminary computations on connections and curvature of surfaces with orthogonal coordinates.

9.5.1 Surfaces with orthogonal coordinates.

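We consider a surface with coordinates $u, v$ and metric

$$E\,du^2 + G\,dv^2$$

and set

$$\epsilon_1 := \operatorname{sgn} E, \qquad \epsilon_2 := \operatorname{sgn} G$$

and write

$$E\,du^2 + G\,dv^2 = \epsilon_1(\theta^1)^2 + \epsilon_2(\theta^2)^2$$

where

$$\theta^1 := e\,du, \quad e := \sqrt{\epsilon_1E}, \quad e > 0,$$

and

$$\theta^2 := g\,dv, \quad g := \sqrt{\epsilon_2G}, \quad g > 0.$$

The dual orthonormal frame field is given by

$$F_1 = \frac{1}{e}\partial_u, \qquad F_2 = \frac{1}{g}\partial_v.$$

The connection forms $\omega^1_2$ and $\omega^2_1 = -\epsilon_1\epsilon_2\omega^1_2$ are defined by

$$\omega^1_2(X) = \theta^1(\nabla_XF_2), \qquad \omega^2_1(X) = \theta^2(\nabla_XF_1)$$

for any vector field $X$, and are determined by the Cartan equations

$$d\theta^1 + \omega^1_2\wedge\theta^2 = 0, \qquad d\theta^2 + \omega^2_1\wedge\theta^1 = 0.$$

The curvature form is then given by $d\omega^1_2$. We find the connection forms by straightforward computation:

$$d\theta^1 = e_v\,dv\wedge du = -\frac{e_v}{g}du\wedge\theta^2$$
$$d\theta^2 = g_u\,du\wedge dv = -\frac{g_u}{e}dv\wedge\theta^1$$

where subscripts denote partial derivatives. Thus

$$\omega^1_2 = \frac{e_v}{g}du - \epsilon_1\epsilon_2\frac{g_u}{e}dv$$

satisfies both structure equations (with $\omega^2_1 = -\epsilon_1\epsilon_2\omega^1_2$) and is uniquely determined by them. We compute

$$d\omega^1_2 = \left(\frac{e_v}{g}\right)_vdv\wedge du - \epsilon_1\epsilon_2\left(\frac{g_u}{e}\right)_udu\wedge dv = -\left[\left(\frac{e_v}{g}\right)_v + \epsilon_1\epsilon_2\left(\frac{g_u}{e}\right)_u\right]du\wedge dv.$$

This is the curvature form $\Omega^1_2$. In general, the Riemann curvature is related to the connection form by

$$R_{vw}(F_j) = -\sum_i\Omega^i_j(v, w)F_i.$$

In our case there is only one term in the sum and the sectional curvature, which equals the Gauss curvature, is $K = \epsilon_1\epsilon_2\langle R_{F_1F_2}F_1, F_2\rangle$, where

$$\langle R_{F_1F_2}F_1, F_2\rangle = -\langle R_{F_1F_2}F_2, F_1\rangle = \langle\Omega^1_2(F_1, F_2)F_1, F_1\rangle = \epsilon_1\Omega^1_2(F_1, F_2) = -\frac{1}{eg}\left[\epsilon_1\left(\frac{e_v}{g}\right)_v + \epsilon_2\left(\frac{g_u}{e}\right)_u\right].$$

So

$$K = -\frac{1}{eg}\left[\epsilon_1\left(\frac{g_u}{e}\right)_u + \epsilon_2\left(\frac{e_v}{g}\right)_v\right] \tag{9.61}$$

is the formula for the curvature of a surface in terms of orthogonal coordinates.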
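The curvature formula for orthogonal coordinates can be tried out numerically. The sketch below assumes that (9.61) reads K = -(1/eg)[eps1 (g_u/e)_u + eps2 (e_v/g)_v] and approximates the partial derivatives by central differences; the function name, the step size and the two test surfaces (round sphere and hyperbolic plane in polar coordinates) are illustrative assumptions.

```python
import math

def gauss_curvature(e, g, eps1, eps2, u, v, step=1e-4):
    """Approximate K from (9.61) for the metric eps1 e^2 du^2 + eps2 g^2 dv^2.

    e and g are functions of (u, v); derivatives are central differences.
    """
    def d_du(func):
        return lambda a, b: (func(a + step, b) - func(a - step, b)) / (2 * step)

    def d_dv(func):
        return lambda a, b: (func(a, b + step) - func(a, b - step)) / (2 * step)

    def gu_over_e(a, b):
        return d_du(g)(a, b) / e(a, b)

    def ev_over_g(a, b):
        return d_dv(e)(a, b) / g(a, b)

    bracket = eps1 * d_du(gu_over_e)(u, v) + eps2 * d_dv(ev_over_g)(u, v)
    return -bracket / (e(u, v) * g(u, v))

# Round sphere: du^2 + sin(u)^2 dv^2; hyperbolic plane: du^2 + sinh(u)^2 dv^2.
sphere = gauss_curvature(lambda u, v: 1.0, lambda u, v: math.sin(u), 1, 1, 1.0, 0.3)
hyperbolic = gauss_curvature(lambda u, v: 1.0, lambda u, v: math.sinh(u), 1, 1, 1.0, 0.3)
print(sphere, hyperbolic)
```

Up to discretisation error the two values are +1 and -1, the expected Gauss curvatures.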



9.5.2 The Schwarzschild plane. 

In the case of the Schwarzschild plane $P$, we have $eg = 1$ so $e_r/g = ee_r = \tfrac{1}{2}\epsilon_1E_r$, and the partial derivatives with respect to $t$ vanish. The formula simplifies to $K = \tfrac{1}{2}E_{rr}$ or

$$K = \frac{2M}{r^3}. \tag{9.62}$$
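A quick numerical check of (9.62): the sketch below takes E = -h = -(1 - 2M/r), approximates E_rr by a second central difference and compares (1/2) E_rr with 2M/r^3. The function name and step size are illustrative; the same expression is used on both sides of r = 2M.

```python
def schwarzschild_plane_curvature(M, r, step=1e-3):
    """Approximate K = (1/2) E_rr with E(r) = -(1 - 2M/r)."""
    def E(s):
        return -(1.0 - 2.0 * M / s)

    second_derivative = (E(r + step) - 2.0 * E(r) + E(r - step)) / step ** 2
    return 0.5 * second_derivative

M = 1.0
for r in (3.0, 1.0):   # one point in the exterior, one inside r = 2M
    print(r, schwarzschild_plane_curvature(M, r), 2.0 * M / r ** 3)
```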



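The connection form in the Schwarzschild plane is given by

$$\omega^1_2 = \frac{M}{r^2}dt, \qquad \omega^2_1 = \frac{M}{r^2}dt$$

by the same computation since $\epsilon_1\epsilon_2 = -1$. So

$$\nabla_{\partial_t}\partial_t = h^{1/2}\omega^2_1(\partial_t)F_2 = h^{1/2}\frac{M}{r^2}F_2 = \frac{Mh}{r^2}\partial_r.$$

Similarly,

$$\nabla_{\partial_r}\partial_t = \partial_r(h^{1/2})F_1 = h^{-1/2}\frac{M}{r^2}F_1 = \frac{M}{r^2h}\partial_t$$

and

$$\nabla_{\partial_r}\partial_r = \partial_r(h^{-1/2})F_2 = -\frac{M}{r^2h}\partial_r.$$

We will also need the Hessian of the function $r$. We have, by definition,

$$H^r(X, Y) = \langle\nabla_X(\operatorname{grad} r), Y\rangle.$$

Now

$$\operatorname{grad} r = h\partial_r$$

and

$$\nabla_{\partial_t}(h\partial_r) = h\nabla_{\partial_t}\partial_r = \frac{M}{r^2}\partial_t$$
$$\nabla_{\partial_r}(h\partial_r) = h_r\partial_r + h\nabla_{\partial_r}\partial_r = \frac{M}{r^2}\partial_r.$$

Thus

$$H^r = \frac{M}{r^2}\langle\ ,\ \rangle. \tag{9.63}$$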



9.5.3 Covariant derivatives. 

We wish to apply the formulas for covariant derivatives in warped products to the basic vector fields $\partial_t, \partial_r$ and to vector fields $V, W$ tangent to the sphere (considered as vector fields on $N\cup B$, the warped product).

The covariant derivatives of basic vector fields are the lifts of the corresponding vector fields on the base, and so from the previous subsection we get

$$\nabla_{\partial_t}\partial_t = \frac{Mh}{r^2}\partial_r \tag{9.64}$$
$$\nabla_{\partial_t}\partial_r = \nabla_{\partial_r}\partial_t = \frac{M}{r^2h}\partial_t \tag{9.65}$$
$$\nabla_{\partial_r}\partial_r = -\frac{M}{r^2h}\partial_r. \tag{9.66}$$

From the formula

$$\nabla_XV = \nabla_VX = \frac{Xf}{f}V$$

for a warped product we get, taking $f = r$,

$$\nabla_{\partial_t}V = \nabla_V\partial_t = 0, \tag{9.67}$$

and

$$\nabla_{\partial_r}V = \nabla_V\partial_r = \frac{1}{r}V. \tag{9.68}$$

Applying the formula for $T_VW$ for a warped product gives

$$T_VW = -\frac{h}{r}\langle V, W\rangle\partial_r \tag{9.69}$$

since $\operatorname{grad} r = h\partial_r$. This is the horizontal component of $\nabla_VW$. The vertical component is just the lift of $\nabla_VW$, the covariant derivative on the sphere.



9.5.4 Schwarzschild curvature. 

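From formulas (9.39) and (9.40) for warped products and our formula (9.62) for the curvature in the Schwarzschild plane we get

$$R_{\partial_t\partial_r}(\partial_t) = (-2Mh/r^3)\partial_r \tag{9.70}$$
$$R_{\partial_r\partial_t}(\partial_r) = (2M/r^3h)\partial_t \tag{9.71}$$
$$R_{\partial_t\partial_r}V = 0. \tag{9.72}$$

From (9.36) and (9.63) we obtain

$$R_{XV}Y = -R_{VX}Y = -\frac{M}{r^3}\langle X, Y\rangle V$$

so

$$R_{\partial_tV}(\partial_t) = (Mh/r^3)V \tag{9.73}$$
$$R_{\partial_tV}(\partial_r) = 0 \tag{9.74}$$
$$R_{\partial_rV}(\partial_t) = 0 \tag{9.75}$$
$$R_{\partial_rV}(\partial_r) = -(M/hr^3)V. \tag{9.76}$$

We apply (9.34) to compute $R_{UV}$. We have $\langle\operatorname{grad} r, \operatorname{grad} r\rangle = h^2\langle\partial_r, \partial_r\rangle = h$ and the fiber over $(t, r)$ is the sphere of radius $r$ whose curvature is $r^{-2}$. We get

$$R_{VW}U = (2M/r^3)\left(\langle U, V\rangle W - \langle U, W\rangle V\right) \tag{9.77}$$
$$R_{VW}(\partial_t) = 0 \tag{9.78}$$
$$R_{VW}(\partial_r) = 0. \tag{9.79}$$

To apply (9.38) we compute

$$\nabla_{\partial_t}\operatorname{grad} r = h\nabla_{\partial_t}(\partial_r) = (M/r^2)\partial_t$$
$$\nabla_{\partial_r}(h\partial_r) = (2M/r^2)\partial_r + h\nabla_{\partial_r}(\partial_r) = (M/r^2)\partial_r$$

so

$$R_{\partial_tV}(W) = R_{\partial_tW}(V) = (M/r^3)\langle V, W\rangle\partial_t \tag{9.80}$$
$$R_{\partial_rV}(W) = R_{\partial_rW}(V) = (M/r^3)\langle V, W\rangle\partial_r. \tag{9.81}$$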



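We show that the Ricci curvature vanishes by applying our formulas for the Ricci curvature of a warped product, (9.41)-(9.43). For a surface, $\operatorname{Ric}(X, Y) = K\langle X, Y\rangle$ and this is $(2M/r^3)\langle X, Y\rangle$ for vectors in the Schwarzschild plane. On the other hand, $d = 2$, $f = r$, $H^r = (M/r^2)\langle\ ,\ \rangle$. This shows that $\operatorname{Ric}(X, Y) = 0$.

For vertical vectors, we have

$$\operatorname{Ric}_F(V, W) = r^{-2}\langle V, W\rangle$$

while

$$\Delta r = C(\operatorname{Hess} r) = 2M/r^2, \qquad \langle\operatorname{grad} f, \operatorname{grad} f\rangle = h \quad\text{so}\quad f^\# = r^{-2}$$

showing that $\operatorname{Ric}(V, W) = 0$.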
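The cancellation can be replayed in exact rational arithmetic. The sketch below assumes the warped product Ricci formulas in the form Ric(X,Y) = Ric_B(X,Y) - (d/f) Hess f(X,Y) and Ric(V,W) = Ric_F(V,W) - <V,W> f#, with f# = (Laplacian f)/f + (d-1) <grad f, grad f>/f^2, and plugs in the Schwarzschild data d = 2, f = r, K = 2M/r^3, Hess r = (M/r^2)< , >. The function name is illustrative.

```python
from fractions import Fraction

def schwarzschild_ricci_coefficients(M, r):
    """Return the coefficients of < , > in Ric on the base and on the fibre."""
    M, r = Fraction(M), Fraction(r)
    h = 1 - 2 * M / r
    d = 2                                  # fibre dimension (the sphere S^2)
    K = 2 * M / r ** 3                     # curvature of the plane, (9.62)
    hess = M / r ** 2                      # Hess r = (M/r^2) < , >, (9.63)
    ric_base = K - d * hess / r            # base part, cf. (9.41)
    laplacian = 2 * hess                   # contraction of the Hessian
    f_sharp = laplacian / r + (d - 1) * h / r ** 2   # cf. (9.44)
    ric_fibre = 1 / r ** 2 - f_sharp       # fibre part, cf. (9.43)
    return ric_base, ric_fibre

print(schwarzschild_ricci_coefficients(1, Fraction(5, 2)))
print(schwarzschild_ricci_coefficients(3, 4))
```

Both coefficients vanish identically in M and r, inside as well as outside r = 2M.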



9.5.5 Cartan computation. 



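We have used the techniques of warped products to compute the Schwarzschild connection and curvature. However, the Cartan method is more direct:

$$ds^2 = (\theta^0)^2 - (\theta^1)^2 - (\theta^2)^2 - (\theta^3)^2$$
$$\theta^0 = \sqrt{h}\,dt, \qquad h := 1 - \frac{2M}{r}$$
$$\theta^1 = \frac{1}{\sqrt{h}}dr$$
$$\theta^2 = r\,d\vartheta$$
$$\theta^3 = rS\,d\phi$$
$$S = \sin\vartheta, \qquad C = \cos\vartheta$$
$$d\sqrt{h} = \frac{1}{2\sqrt{h}}\left(\frac{2M}{r^2}\right)dr$$

so

$$d\theta^0 = \frac{M}{r^2\sqrt{h}}dr\wedge dt = -\frac{M}{r^2\sqrt{h}}\theta^0\wedge\theta^1$$
$$d\theta^1 = 0$$
$$d\theta^2 = dr\wedge d\vartheta = -\frac{\sqrt{h}}{r}\theta^2\wedge\theta^1$$
$$d\theta^3 = S\,dr\wedge d\phi + rC\,d\vartheta\wedge d\phi = -\frac{\sqrt{h}}{r}\theta^3\wedge\theta^1 - \frac{C}{rS}\theta^3\wedge\theta^2$$

or

$$d\theta = -\omega\wedge\theta$$

where

$$\theta = \begin{pmatrix} \theta^0 \\ \theta^1 \\ \theta^2 \\ \theta^3 \end{pmatrix}, \qquad \omega = \begin{pmatrix} 0 & \frac{M}{r^2\sqrt{h}}\theta^0 & 0 & 0 \\ \frac{M}{r^2\sqrt{h}}\theta^0 & 0 & -\frac{\sqrt{h}}{r}\theta^2 & -\frac{\sqrt{h}}{r}\theta^3 \\ 0 & \frac{\sqrt{h}}{r}\theta^2 & 0 & -\frac{C}{rS}\theta^3 \\ 0 & \frac{\sqrt{h}}{r}\theta^3 & \frac{C}{rS}\theta^3 & 0 \end{pmatrix}.$$

Now

$$\frac{M}{r^2\sqrt{h}}\theta^0 = \frac{M}{r^2}dt \quad\text{so}\quad d\left(\frac{M}{r^2\sqrt{h}}\theta^0\right) = \frac{2M}{r^3}\theta^0\wedge\theta^1$$

$$\frac{\sqrt{h}}{r}\theta^2 = \sqrt{h}\,d\vartheta \quad\text{so}\quad d\left(\frac{\sqrt{h}}{r}\theta^2\right) = \frac{M}{r^2\sqrt{h}}dr\wedge d\vartheta = \frac{M}{r^3}\theta^1\wedge\theta^2$$

$$d\left(\frac{\sqrt{h}}{r}\theta^3\right) = d\left(\sqrt{h}\,S\,d\phi\right) = \frac{M}{r^2\sqrt{h}}dr\wedge S\,d\phi + \sqrt{h}\,C\,d\vartheta\wedge d\phi = \frac{M}{r^3}\theta^1\wedge\theta^3 + \frac{C\sqrt{h}}{r^2S}\theta^2\wedge\theta^3$$

$$d(C\,d\phi) = -S\,d\vartheta\wedge d\phi = -\frac{1}{r^2}\theta^2\wedge\theta^3.$$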



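This then gives the curvature matrix in this frame as

$$\Omega := d\omega + \omega\wedge\omega = \frac{M}{r^3}\begin{pmatrix} 0 & 2\,\theta^0\wedge\theta^1 & -\theta^0\wedge\theta^2 & -\theta^0\wedge\theta^3 \\ 2\,\theta^0\wedge\theta^1 & 0 & -\theta^1\wedge\theta^2 & -\theta^1\wedge\theta^3 \\ -\theta^0\wedge\theta^2 & \theta^1\wedge\theta^2 & 0 & 2\,\theta^2\wedge\theta^3 \\ -\theta^0\wedge\theta^3 & \theta^1\wedge\theta^3 & -2\,\theta^2\wedge\theta^3 & 0 \end{pmatrix}.$$

The curvature tensor is given in terms of $\Omega$ as

$$R_{vw}(E_j) = \sum_i\Omega^i_j(v, w)E_i$$

or

$$R^i_{jk\ell} = \Omega^i_j(E_k, E_\ell).$$

Notice from the form of $\Omega$ given above that $R^i_{jk\ell} = 0$ if $j \neq k, \ell$. Hence $R^i_{ji\ell} = 0$ if $j \neq \ell$. Looking at the columns we see that $R^i_{jij} = \pm(2 - 1 - 1)M/r^3 = 0$. Thus the Schwarzschild metric is Ricci flat.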



9.5.6 Petrov type. 

The tensor $R^{ab}{}_{cd}$ is obtained from the tensor $R^a{}_{bcd} = \Omega^a_b(E_c, E_d)$ by "raising" the second index. We want to consider this as the matrix of the operator $[R]$ relative to the basis $E_i\wedge E_j$. If we use $ij$ to stand for $E_i\wedge E_j$ and omit the zero entries we see that the matrix of $[R]$, with rows and columns indexed by $01, 02, 03, 23, 31, 12$, is

$$[R] = \operatorname{diag}\left(\frac{2M}{r^3},\ -\frac{M}{r^3},\ -\frac{M}{r^3},\ \frac{2M}{r^3},\ -\frac{M}{r^3},\ -\frac{M}{r^3}\right). \tag{9.82}$$

We can write this in block three by three form as

$$[R] = \begin{pmatrix} A & 0 \\ 0 & A \end{pmatrix}$$

where

$$A = \begin{pmatrix} 2M/r^3 & & \\ & -M/r^3 & \\ & & -M/r^3 \end{pmatrix}.$$

On the other hand we have

$$*(E_0\wedge E_1) = E_2\wedge E_3$$
$$*(E_0\wedge E_2) = E_3\wedge E_1$$
$$*(E_0\wedge E_3) = E_1\wedge E_2 \quad\text{and}$$
$$*^2 = -\mathrm{id}.$$

Thus the matrix of $*$ relative to the same basis is

$$\begin{pmatrix} 0 & -I \\ I & 0 \end{pmatrix}$$

where $I$ is the three by three identity matrix. Clearly the operator given by $R$ on $\wedge^2T(M)$ commutes with the star operator, as predicted by the general theory for any Ricci flat curvature, and we see from the form of the matrix $A$ that it is of Petrov type D, with real eigenvalues $2M/r^3$, $-M/r^3$, $-M/r^3$.



9.5.7 Kerr-Schild form. 

We will show that, by making a change of variables, the metric is the sum of a flat metric and a multiple of the square of a linear differential form $\alpha$, where $\|\alpha\|^2 = 0$ in the flat metric. The generalization of this construction will be important in the case of rotating black holes. We make the change of variables in two stages: Let

$$u = t + T(r)$$

where $T$ is any function of $r$ (determined up to additive constant) such that

$$T' = \frac{1}{h}.$$

Then

$$du = dt + \frac{1}{h}dr$$

so

$$h\,dt^2 = h\,du^2 - 2\,du\,dr + \frac{1}{h}dr^2$$

$$-h\,dt^2 + \frac{1}{h}dr^2 = -h\,du^2 + 2\,du\,dr = -(du - dr)^2 + dr^2 + \frac{2M}{r}du^2.$$

So if we set

$$x^0 := u - r$$

this becomes

$$-d(x^0)^2 + dr^2 + \frac{2M}{r}\left[dx^0 + dr\right]^2.$$

The form $dx^0 + dr$ has square length zero in the flat metric

$$-d(x^0)^2 + dx^2 + dy^2 + dz^2, \qquad r^2 = x^2 + y^2 + z^2$$

and the Schwarzschild metric is given by

$$-d(x^0)^2 + dr^2 + r^2d\sigma^2 + \frac{2M}{r}\left[dx^0 + dr\right]^2$$

which is the desired Kerr-Schild form.
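The change of variables can be verified as an identity between quadratic forms in the (t, r) plane. The sketch below assumes the relations du = dt + dr/h and dx0 = du - dr, and takes the Kerr-Schild term with a plus sign, (2M/r)(dx0 + dr)^2, which is what the algebra -h du^2 + 2 du dr = -(du - dr)^2 + dr^2 + (2M/r) du^2 yields. The function names are illustrative.

```python
from fractions import Fraction

def h_of(M, r):
    return 1 - Fraction(2 * M) / r

def schwarzschild_form(dt, dr, M, r):
    """-h dt^2 + dr^2 / h evaluated on the components (dt, dr)."""
    h = h_of(M, r)
    return -h * dt ** 2 + dr ** 2 / h

def kerr_schild_form(dx0, dr, M, r):
    """-d(x0)^2 + dr^2 + (2M/r)(dx0 + dr)^2 on the components (dx0, dr)."""
    return -dx0 ** 2 + dr ** 2 + Fraction(2 * M) / r * (dx0 + dr) ** 2

def dt_from(dx0, dr, M, r):
    """Invert dx0 = dt + dr/h - dr to recover dt."""
    return dx0 + dr - dr / h_of(M, r)

M, r = Fraction(1), Fraction(5)
for dx0, dr in [(1, 0), (0, 1), (2, -3), (Fraction(1, 2), Fraction(7, 3))]:
    dt = dt_from(dx0, dr, M, r)
    print(schwarzschild_form(dt, dr, M, r) == kerr_schild_form(dx0, dr, M, r))
```

Because both sides are quadratic forms in two variables, agreement on a handful of independent vectors already confirms the identity at the chosen point.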

9.5.8 Isometries. 

A vector field $X$ is an infinitesimal isometry or a Killing field if its flow preserves the metric. This is equivalent to the assertion that

$$L_X\langle Y, Z\rangle = \langle[X, Y], Z\rangle + \langle Y, [X, Z]\rangle \tag{9.83}$$

for all vector fields $Y$ and $Z$. Now

$$X\langle Y, Z\rangle = \langle\nabla_XY, Z\rangle + \langle Y, \nabla_XZ\rangle$$

and

$$\nabla_XY = \nabla_YX + [X, Y]$$

with a similar equation for $Z$. So (9.83) is equivalent to

$$\langle\nabla_YX, Z\rangle + \langle\nabla_ZX, Y\rangle = 0. \tag{9.84}$$

Let $S$ be a submanifold, $N$ a normal vector field to $S$, and $Y, V, W$ tangential vector fields to $S$, all extended to vector fields in the ambient manifold. Then along $S$ we have the decomposition

$$\nabla_VY = \nabla^S_VY + II(V, Y)$$

into tangential and normal components. So

$$0 = V\langle N, Y\rangle = \langle\nabla_VN, Y\rangle + \langle N, II(V, Y)\rangle.$$

If the submanifold is totally geodesic, so $II(V, Y) = 0$, we see that

$$\langle\nabla_VN, Y\rangle = 0.$$

So for any vector field $X$,

$$\langle\nabla_VX, W\rangle = \langle\nabla_V\tan X, W\rangle$$

and hence if $X$ is a Killing vector field and $S$ a totally geodesic submanifold then the tangential component of $X$ along $S$ is a Killing field for $S$.

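The curvature of the Schwarzschild plane is $2M/r^3$. So any isometry must preserve $r$ since it preserves curvature. Hence it must be of the form

$$(t, r) \mapsto (\phi(t, r), r)$$

and so carries the vector fields

$$\partial_r \mapsto \frac{\partial\phi}{\partial r}\partial_t + \partial_r, \qquad \partial_t \mapsto \frac{\partial\phi}{\partial t}\partial_t.$$

Comparing the lengths of $\partial_r$ and its image we see that

$$\frac{\partial\phi}{\partial r} = 0$$

and comparing the lengths of $\partial_t$ and its image shows that

$$\left(\frac{\partial\phi}{\partial t}\right)^2 = 1.$$

So, up to the reflection $t \mapsto -t$, the only isometries of the Schwarzschild plane are translations in $t$, i.e. $(t, r) \mapsto (t + c, r)$.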

Since the planes (at fixed spherical angle) are totally geodesic, this means 
that the tangential component of any Killing vector, Y must be a multiple of 
d t - So the most general Killing field is of the form 



Y = fd t + V 



where V is vertical and / is a function on S 2 . The claim is that / is a constant 
and V does not depend on (t, r) and is a Killing vector for the sphere. In the 
following, U denotes any vector field on the sphere lifted up to be a vertical 
vector field, and u denotes the value of this vector field at some point q £ S 2 . 
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Wc have 

M 

^d r V = -V 
r 

V dr U = -U so 
r 

d r (V,U) = (V dr V,U) + (V,V dr U) 

= -(v,u). 

r 

Solving this equation for a fixed point q e S 2 and fixed tangent vector u at q, 
we see that (at these fixed values) 



(V,U) = g(t)r 2 



Now 



V dt U = so 

(V dt Y,U) = d t (Y,U) 

\JjjY = Ufd t + (• • • )d r + vertical 
so 

(VuY,d t ) = -hUf so 

(V dt Y,U) + (VuY,d t ) = (Killing) implies 

d t (V,U) - 0. 

Again, fixing $u$, this gives

\[ g'(t)\,r^2 = h(r)\,Uf. \]

But no multiple of $r^2$ can equal any multiple of $h(r) = 1 - \frac{2M}{r}$ unless both
multiples are zero. So $g' = 0$, which implies that

\[ \langle V,U\rangle = k(U)\,r^2. \]

But the factor $r^2$ is what we multiply the spherical metric by in the Schwarzschild
metric. Hence this last equation shows that the projection of $V$ onto the sphere
does not depend on $r$ or $t$. Then it must be a Killing field on $S^2$. The condition
$Uf = 0$ implies that $f$ is a constant.

Conclusion: the connected group of isometries of the Schwarzschild solution
is

\[ \mathbb{R} \times SO(3) \]

consisting of time translations and rotations of the sphere.

9.6 Robertson Walker metrics.



These are warped products of an interval $I \subset \mathbb{R}$ (with a negative definite metric)
and a three dimensional spacelike manifold, $S$, of constant curvature. So $S$ is
either the three sphere, Euclidean three dimensional space, or hyperbolic three
space. We use arc length, $t$, as the coordinate on $I$, so the total metric has the
form

\[ -dt^2 + f^2\,d\sigma^2 \]

where $d\sigma^2$ is the constant curvature metric on $S$ and $f = f(t)$. We write

\[ \partial_t f = f', \qquad \operatorname{grad} f = -f'\,\partial_t \]
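so covariant derivatives are given by

\[
\begin{aligned}
\nabla_{\partial_t}(\partial_t) &= 0\\
\nabla_{\partial_t}V = \nabla_V(\partial_t) &= (f'/f)\,V\\
\operatorname{nor}\nabla_V W &= \langle V,W\rangle\,(f'/f)\,\partial_t
\end{aligned}
\tag{9.85--9.88}
\]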



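We have $H^f(\partial_t,\partial_t) = f''$ and so

\[ R_{V\partial_t}\partial_t = (f''/f)\,V \tag{9.89} \]
\[ R_{VW}\partial_t = 0 \tag{9.90} \]
\[ R_{V\partial_t}W = (f''/f)\,\langle V,W\rangle\,\partial_t \tag{9.91} \]
\[ R_{UV}W = \left[(f'/f)^2 + (k/f^2)\right]\left[\langle U,W\rangle V - \langle V,W\rangle U\right] \tag{9.92} \]

where $k = 1$, $0$ or $-1$ is the constant curvature of $S$. The fiber dimension is
$d = 3$ so

\[ \operatorname{Ric}(\partial_t,\partial_t) = -\frac{3f''}{f} \tag{9.93} \]

while $\operatorname{Ric}(\partial_t, V) = 0$ as always in a warped product and

\[ \operatorname{Ric}(V,W) = \left[ 2\left(\frac{f'}{f}\right)^2 + \frac{2k}{f^2} + \frac{f''}{f} \right]\langle V,W\rangle. \tag{9.94} \]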



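Taking the contraction of the Ricci tensor gives the scalar curvature as

\[ S = 6\left[ \left(\frac{f'}{f}\right)^2 + \frac{k}{f^2} + \frac{f''}{f} \right] \tag{9.95} \]

and hence the Einstein tensor $T = \operatorname{Ric} - \frac12 S\langle\ ,\ \rangle$ is given by

\[ T(V,W) = p\,\langle V,W\rangle, \qquad p := -\left[ \frac{2f''}{f} + \left(\frac{f'}{f}\right)^2 + \frac{k}{f^2} \right]. \]

Also

\[ T(\partial_t,\partial_t) = \rho, \qquad \rho := 3\left(\frac{f'}{f}\right)^2 + \frac{3k}{f^2}. \]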

With these definitions of $\rho$ and $p$ we can write

\[ T = (\rho + p)\,dt \otimes dt + p\,\langle\ ,\ \rangle. \tag{9.96} \]

An energy momentum tensor of type

\[ T = (\rho + p)\,\theta \otimes \theta + p\,g \]

where $X$ is a forward timelike vector field and

\[ \theta(\cdot) = \langle X, \cdot\rangle \]

and where $\rho$ and $p$ are functions, is called a perfect fluid for reasons explained
in O'Neill. The function $\rho$ is called the energy density and $p$ the pressure.
The fluid is called a dust if $p = 0$. A Robertson Walker model which is a dust
is called a Friedman model.

Let us compute the covariant divergence of the $T$ given by (9.96). We compute
relative to a frame field whose first component is $\partial_t$ and whose last three
components $U_1, U_2, U_3$ are therefore vertical. The covariant divergence is defined
to be

\[ \sum_i \epsilon_i \nabla_{E_i}T(E_i, \cdot). \]

In all situations, the covariant divergence of $hg$ is just $dh$ since $\nabla_E g = 0$ and

\[ \sum_i \epsilon_i\, dh(E_i)\,\langle E_i, \cdot\rangle = dh. \]

Hence we obtain, for $\operatorname{div} T$, the expression

\[ -(\rho' + p')\,dt + (\rho + p)\sum_i \epsilon_i \nabla_{E_i}(dt)(E_i)\,dt + p'\,dt. \]

Now $\nabla_{\partial_t}(dt)(\partial_t) = -dt(\nabla_{\partial_t}\partial_t) = 0$, while

\[ (\nabla_U dt)(U) = -dt(\nabla_U U) = -\frac{f'}{f} \]

for any unit vector $U$ orthogonal to $\partial_t$. Thus we obtain

\[ -\left[ \rho' + 3(\rho + p)\frac{f'}{f} \right] dt \]

for the covariant divergence.

9.6.1 Cosmogeny and eschatology.

The function

\[ H := \frac{f'}{f} \]

is called the (Hubble) expansion rate for obvious reasons. The vanishing of the
covariant divergence of $T$ yields the equation

\[ \rho' = -3(\rho + p)H. \tag{9.97} \]
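As a sanity check on (9.97), the following sketch evaluates the density and pressure numerically for one concrete scale factor. The choice $f(t) = t^{2/3}$ (spatially flat, $k = 0$) is an assumed example that is not taken from the text, the function names are hypothetical, and the formulas for the density and pressure are the ones read off from the Einstein tensor above. Derivatives are approximated by central differences, so the pressure and the residual of (9.97) should only vanish up to discretization error.

```python
def derivatives(f, t, h=1e-4):
    """Return f(t), f'(t), f''(t) using central finite differences."""
    f0 = f(t)
    f1 = (f(t + h) - f(t - h)) / (2 * h)
    f2 = (f(t + h) - 2 * f0 + f(t - h)) / h**2
    return f0, f1, f2


def density(f, t, k=0):
    """rho = 3 (f'/f)^2 + 3 k / f^2."""
    f0, f1, _ = derivatives(f, t)
    return 3 * (f1 / f0) ** 2 + 3 * k / f0**2


def pressure(f, t, k=0):
    """p = -(2 f''/f + (f'/f)^2 + k/f^2)."""
    f0, f1, f2 = derivatives(f, t)
    return -(2 * f2 / f0 + (f1 / f0) ** 2 + k / f0**2)


def hubble(f, t):
    """H = f'/f."""
    f0, f1, _ = derivatives(f, t)
    return f1 / f0


def scale_factor(t):
    # Assumed example (not from the text): spatially flat dust model.
    return t ** (2.0 / 3.0)


t0 = 1.0
rho = density(scale_factor, t0)
p = pressure(scale_factor, t0)
H = hubble(scale_factor, t0)

# Left side of (9.97): rho' approximated by a central difference in t.
dt = 1e-3
rho_prime = (density(scale_factor, t0 + dt) - density(scale_factor, t0 - dt)) / (2 * dt)
residual = rho_prime + 3 * (rho + p) * H

print(f"rho = {rho:.6f}, p = {p:.2e}, H = {H:.6f}, residual = {residual:.2e}")
```

For this scale factor the pressure comes out as zero within rounding, so the model is a dust in the sense defined above, and the residual of (9.97) is at the level of the finite-difference error.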

If we go back to the definitions of $\rho$ and $p$ we see that

\[ \rho + 3p = -\frac{6f''}{f}. \]

Now in the known universe, $\rho \gg p > 0$, so $f'' < 0$. So the graph of $f$ is convex
down, i.e. it lies below its tangent line at any point. Let $H_0 = H(t_0)$ denote
the Hubble constant at the present time, $t_0$. The tangent line to the graph of $f$
at $t_0$ has slope $H_0 f(t_0)$ and hence is given by the equation

\[ \ell(t) = f(t_0) + H_0 f(t_0)(t - t_0). \]

At $t_0 - H_0^{-1}$ the line $\ell$ crosses the axis. Since $f > 0$ by definition, this shows
that the model must fail at some time $t_*$ in the past, no more than $H_0^{-1}$ units
of time ago. (The current estimates on Hubble's constant give this value as
somewhere between ten and twenty billion years.) Notice also that if $f'(T) < 0$
at some time $T$ in the future then the convex downward property implies that the
model will also fail at some future time $T^*$.
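The bound $H_0^{-1}$ becomes a number of years after a unit conversion. The sketch below assumes that $H_0$ is quoted in km/s/Mpc; the sample values 50 and 100 are illustrative assumptions chosen to bracket the range quoted above, not measurements taken from the text, and the function name is hypothetical.

```python
KM_PER_MPC = 3.0857e19        # kilometres in one megaparsec
SECONDS_PER_GYR = 3.15576e16  # seconds in 10^9 Julian years


def hubble_time_gyr(h0_km_s_mpc):
    """Return 1/H0 in units of 10^9 years, for H0 given in km/s/Mpc."""
    h0_per_second = h0_km_s_mpc / KM_PER_MPC
    return 1.0 / h0_per_second / SECONDS_PER_GYR


for h0 in (50.0, 100.0):  # illustrative values only
    print(h0, round(hubble_time_gyr(h0), 2))
```

A larger expansion rate gives a shorter upper bound on the age of the model, since the tangent line at $t_0$ is steeper and reaches the axis sooner.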

For a discussion of further details of the "big bang" and "big crunch" and
more specifically Friedman models where it is assumed that $p = 0$ see O'Neill.



Chapter 10 

Petrov types. 



10.1 Algebraic properties of the curvature tensor

The Riemann curvature tensor $\langle R_{XY}Z, W\rangle$ is anti-symmetric in $X, Y$ and in
$Z, W$ so can be thought of as a bilinear form on $\Lambda^2 TM_m$ at any point $m$ of a
semi-Riemannian manifold $M$. It is also invariant under simultaneous interchange
of $X, Y$ with $Z, W$ so this bilinear form is symmetric. In addition, it satisfies
the cyclicity condition

\[ \langle R_{XY}Z, W\rangle + \langle R_{XZ}W, Y\rangle + \langle R_{XW}Y, Z\rangle = 0. \]

We want to consider the algebraic possibilities and properties of this tensor,
so will replace $TM_m$ by a general vector space $V$ with non-degenerate scalar
product and want to consider symmetric bilinear forms $R$ on $\Lambda^2 V$ which satisfy

\[ R(v \wedge x, y \wedge z) + R(v \wedge y, z \wedge x) + R(v \wedge z, x \wedge y) = 0. \tag{10.1} \]

For example, if $V$ is four dimensional, then $\Lambda^2 V$ is six dimensional, and
the space of symmetric bilinear forms on $\Lambda^2 V$ is 21 dimensional. The cyclicity
condition in this case imposes no constraint on $R$ if $v$ is equal to (and hence
linearly dependent on) $x$, $y$ or $z$. Hence there is only one equation on $R$ implied
by (10.1) in this case. Thus the space of possible curvature tensors at any point
in a four dimensional semi-Riemannian manifold is 20 dimensional. The Ricci
tensor is the contraction (say with respect to the (1,3) position) of the Riemann
curvature:

\[ \operatorname{Ric}(R) = C_{13}(R), \qquad \operatorname{Ric}(R)(x, y) := \sum_a \epsilon_a R(e_a \wedge x, e_a \wedge y) \]

where the sum is over any "orthonormal" basis and $\epsilon_a = \langle e_a, e_a\rangle = \pm 1$. It is a
symmetric tensor on $V$. So we can think of Ric as a map from the space of possible
curvatures to possible Ricci curvatures. If we let

\[ \operatorname{Curv}(V) \subset S^2((\Lambda^2 V)^*) \]

denote the subspace of the space of symmetric bilinear forms on $\Lambda^2 V$ satisfying
(10.1), then

\[ \operatorname{Ric} : \operatorname{Curv}(V) \to S^2(V^*). \]

Let us show that if $\dim V > 2$ this map is surjective. Indeed, suppose that
$A \in S^2(V^*)$. Let $A \wedge A$ denote the induced symmetric form on $\Lambda^2 V$ so that

\[ (A \wedge A)(v \wedge w, x \wedge y) := A(v, x)A(w, y) - A(v, y)A(w, x). \]

Holding $v$ fixed and cyclically summing over $w, x, y$ we get

\[ A(v, x)[A(w, y) - A(y, w)] + A(v, y)[A(x, w) - A(w, x)] + A(v, w)[A(y, x) - A(x, y)] = 0. \]

Thus $A \wedge A$ satisfies (10.1). If $A$ and $B$ are two elements of $S^2(V^*)$ we see that

\[ A \wedge B + B \wedge A := (A + B) \wedge (A + B) - A \wedge A - B \wedge B \]

also satisfies (10.1). Let $g \in S^2(V^*)$ denote the scalar product itself. We claim
that

\[ \operatorname{Ric}(g \wedge g) = (n - 1)g. \]

Indeed

\[
\begin{aligned}
\operatorname{Ric}(g \wedge g)(v, w) &= \sum_a \epsilon_a\left(\langle e_a, e_a\rangle\langle v, w\rangle - \langle e_a, w\rangle\langle v, e_a\rangle\right)\\
&= n\langle v, w\rangle - \sum_a \epsilon_a\langle v, e_a\rangle\langle e_a, w\rangle\\
&= (n - 1)\langle v, w\rangle.
\end{aligned}
\]

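For any $R \in \operatorname{Curv}(V)$ on $\Lambda^2(V)$ define its "scalar curvature" $S = S(R)$ by

\[ S := \sum_a \epsilon_a \operatorname{Ric}(R)(e_a, e_a). \]

Also, for any $A \in S^2(V^*)$, we have its contraction

\[ C(A) := \sum_a \epsilon_a A(e_a, e_a) \]

so

\[ S(R) = C(\operatorname{Ric}(R)). \]

Then

\[
\begin{aligned}
\sum_a \epsilon_a\left(A(e_a, e_a)\langle v, w\rangle - A(e_a, v)\langle e_a, w\rangle\right) &= C(A)\langle v, w\rangle - A(v, w)\\
\sum_a \epsilon_a\left(\langle e_a, e_a\rangle A(v, w) - A(e_a, w)\langle e_a, v\rangle\right) &= (n - 1)A(v, w)
\end{aligned}
\]

so

\[ \operatorname{Ric}(g \wedge A + A \wedge g) = (n - 2)A + C(A)g \tag{10.2} \]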

where $n = \dim V$. Since $\operatorname{Ric}(g \wedge g) = (n - 1)g$ this shows that
$\operatorname{Ric} : \operatorname{Curv}(V) \to S^2(V^*)$ is a surjection. We say that $R$ is Ricci flat if
$\operatorname{Ric}(R) = 0$. Thus in four dimensions, the space of Ricci flat curvature tensors
(at any point) is ten dimensional. The purpose of this chapter is to explain how
the complex geometry of spinors leads to a classification of all possible Ricci
flat curvatures into five types, the Petrov classification published in 1954 in
the relatively obscure journal Sci. Nat. Kazan State University. In analyzing
Petrov type D, Kerr was led to his discovery of the rotating black hole solutions
of the Einstein equations, which generalize the Schwarzschild solution, in 1963.
Unfortunately we will not have time to study the remarkable properties of this
solution. It would take a whole semester.
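The identities used in the surjectivity argument can be checked in components. The sketch below is only an illustration under stated assumptions: it fixes $n = 4$ with a Lorentzian orthonormal basis, picks an arbitrary symmetric integer matrix for $A$, and uses hypothetical helper names. It tests the cyclicity condition (10.1) for $g \wedge A + A \wedge g$, the formula (10.2), and the value of Ric on $g \wedge g$ (computed as half of the polarized form $g \wedge g + g \wedge g$).

```python
from itertools import product

N = 4
EPS = [-1, 1, 1, 1]  # assumed signature: <e_a, e_a> = EPS[a]
G = [[EPS[a] if a == b else 0 for b in range(N)] for a in range(N)]
# An arbitrary symmetric form A with integer entries (illustrative values).
A = [[2, 1, 0, 3], [1, -1, 4, 0], [0, 4, 5, -2], [3, 0, -2, 1]]


def wedge_pair(P, Q):
    """Components of P^Q + Q^P evaluated on (e_a ^ e_b, e_c ^ e_d)."""
    def comp(a, b, c, d):
        return (P[a][c] * Q[b][d] - P[a][d] * Q[b][c]
                + Q[a][c] * P[b][d] - Q[a][d] * P[b][c])
    return comp


def ricci(R):
    """Matrix of Ric(R)(x, y) = sum_a eps_a R(e_a ^ x, e_a ^ y)."""
    return [[sum(EPS[a] * R(a, x, a, y) for a in range(N)) for y in range(N)]
            for x in range(N)]


def is_cyclic(R):
    """Check the cyclicity condition (10.1) on all basis vectors."""
    return all(R(v, x, y, z) + R(v, y, z, x) + R(v, z, x, y) == 0
               for v, x, y, z in product(range(N), repeat=4))


mixed = wedge_pair(G, A)         # g ^ A + A ^ g
twice_g_wedge_g = wedge_pair(G, G)  # equals 2 (g ^ g)

trace_A = sum(EPS[a] * A[a][a] for a in range(N))  # the contraction C(A)
expected = [[(N - 2) * A[x][y] + trace_A * G[x][y] for y in range(N)]
            for x in range(N)]
twice_expected_g = [[2 * (N - 1) * G[x][y] for y in range(N)] for x in range(N)]

print(is_cyclic(mixed))
print(ricci(mixed) == expected)
print(ricci(twice_g_wedge_g) == twice_expected_g)
```

All arithmetic is on integers, so the comparisons are exact rather than approximate.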

Let us briefly go back to the general situation where $\dim V > 2$. Let
$R \in \operatorname{Curv}(V)$. Then $W$ defined by

\[ R = W + \frac{1}{n-2}\left(g \wedge \operatorname{Ric}(R) + \operatorname{Ric}(R) \wedge g\right) - \frac{S}{(n-1)(n-2)}\, g \wedge g \]

satisfies

\[ \operatorname{Ric}(W) = 0. \]

It is called the Weyl curvature (or the Weyl component of the Riemann curvature).
It is, as was discovered by Hermann Weyl, a conformal invariant of the
metric. In three dimensions we have $\dim \Lambda^2 = 3$, so $\operatorname{Curv}(V)$ and $S^2(V^*)$ are both
six dimensional and hence $\ker \operatorname{Ric} = 0$: there are no Weyl tensors. They exist in
four or more dimensions.
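The dimension counts quoted in this section (21, 20 and 10 in dimension four, and no Weyl tensors in dimension three) fit a general pattern. The sketch below uses the standard count $\dim \operatorname{Curv}(V) = n^2(n^2-1)/12$, which is not derived in the text and is quoted here as an assumption, together with $\dim S^2(V^*) = n(n+1)/2$ and the surjectivity of Ric for $n > 2$. The function names are hypothetical.

```python
def dim_sym2_of_lambda2(n):
    """Dimension of the symmetric bilinear forms on Lambda^2 V, dim V = n."""
    m = n * (n - 1) // 2          # dim Lambda^2 V
    return m * (m + 1) // 2


def dim_curv(n):
    """Assumed standard count of algebraic curvature tensors: n^2 (n^2 - 1) / 12."""
    return n * n * (n * n - 1) // 12


def dim_ricci_flat(n):
    """Dimension of ker Ric on Curv(V), valid for n > 2 where Ric is surjective."""
    return dim_curv(n) - n * (n + 1) // 2


for n in (3, 4, 5):
    print(n, dim_sym2_of_lambda2(n), dim_curv(n), dim_ricci_flat(n))
```

In dimension four the difference between the first two columns is the single equation imposed by (10.1), and the last column is the dimension of the space of Weyl tensors.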

We now turn to the special properties of the curvature tensors in general
relativity. In what follows, all vector spaces and tensor products are over the 
complex numbers unless otherwise specified. All vector spaces are assumed to 
be finite dimensional. 

10.2 Linear and antilinear maps. 

A map $\phi : U \to V$ between vector spaces is called antilinear if

\[ \phi(a_1u_1 + a_2u_2) = \bar a_1\phi(u_1) + \bar a_2\phi(u_2) \quad \forall u_1, u_2 \in U,\ a_1, a_2 \in \mathbb{C}. \]

The composition of two antilinear maps is linear, and the composition of a linear
map with an antilinear map (in either order) is antilinear.

We let $U^\#$ denote the space of all antilinear functions on $U$, that is the set
of all antilinear maps $\phi : U \to \mathbb{C}$. As usual, we let $U^*$, the complex dual space
of $U$, denote the space of linear maps $U \to \mathbb{C}$. We have a canonical linear
isomorphism

\[ U \to (U^\#)^\# \]

where $u \in U$ is sent to the antilinear function of $f \in U^\#$ given by

\[ f \mapsto \overline{f(u)}. \]

Notice that

\[ \overline{f(au)} = \overline{\bar a f(u)} = a \cdot \overline{f(u)}, \]

so this map of $U \to (U^\#)^\#$ is linear. It is injective and hence bijective since our
spaces are finite dimensional.
We define

\[ \overline{U} := U^{*\#} \]

so $\overline{U}$ consists of antilinear functions on $U^*$.

Given a linear function, $\ell$, on a vector space, $W$, we get an antilinear
function by composing with the standard conjugation on the complex numbers,
so

\[ \bar\ell = {}^{-} \circ \ell, \qquad {}^{-} : \mathbb{C} \to \mathbb{C} \]

or

\[ \bar\ell(w) = \overline{\ell(w)} \quad \forall w \in W. \]

Also, starting with an antilinear function we produce a linear function by
composition with complex conjugation. Thus, for example, the most general linear
function on $U^*$ is of the form

\[ \ell \mapsto \ell(u), \quad u \in U, \]

and hence the most general antilinear function on $U^*$ is of the form

\[ \ell \mapsto \overline{\ell(u)}. \]

But if we write $\ell = \bar m = {}^{-} \circ m$ where $m \in U^\#$, then, considered as a function of
$m$ this is the assignment

\[ m \mapsto m(u) \]

which is a linear function of $m$. Thus we have a canonical identification

\[ \overline{U} := U^{*\#} = U^{\#*}. \]

Also

\[ \overline{\overline{U}} = (U^{\#*})^{*\#} = U. \]

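We have an antilinear map $u \mapsto \bar u$, $U \to \overline{U}$, given by composition with
conjugation on $\mathbb{C}$ as above, where we think of $U$ as $U^{**}$. So

\[ \bar u(\ell) = \overline{\ell(u)}, \quad \forall \ell \in U^* \]

or

\[ \bar u(m) = m(u), \quad m = \bar\ell \in U^\#. \]

So

\[ \bar{\bar u}(m) = \overline{m(u)} = \ell(u) = u^{**}(\ell) \]

and thus

\[ \bar{\bar u} = u \]

under the identification of $\overline{\overline{U}}$ with $U$.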
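We also have

\[ \overline{U \otimes V} = \overline{U} \otimes \overline{V} \]

as a canonical identification, with

\[ \overline{u \otimes v} = \bar u \otimes \bar v \]

as the map

\[ {}^{-} : U \otimes V \to \overline{U} \otimes \overline{V}. \]

If $b$ is a bilinear form on $U$, then we can think of $\bar b$ as a bilinear form on $\overline{U}$
according to the rule

\[ \bar b(\bar u, \bar v) := \overline{b(u, v)}. \]

Indeed,

\[
\begin{aligned}
\bar b(a\bar u, \bar v) &= \bar b(\overline{\bar a u}, \bar v)\\
&= \overline{b(\bar a u, v)}\\
&= \overline{\bar a\, b(u, v)}\\
&= a\,\overline{b(u, v)},
\end{aligned}
\]

and similarly for $\bar b(\bar u, a\bar v)$.

If $b$ is symmetric or antisymmetric then so is $\bar b$.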

10.3 Complex conjugation and real forms. 

A complex conjugation of a complex vector space, $V$, is an antilinear map of $V$
to itself whose square is the identity. Suppose that

\[ \dagger : V \to V, \quad v \mapsto v^\dagger \]

is such a complex conjugation. Then the set of vectors fixed by $\dagger$,

\[ \{ v \mid v^\dagger = v \} \]

is a real vector space. It is called the real form of the complex vector space,
$V$, relative to the conjugation, $\dagger$. We denote this vector space by $V^\dagger_{\mathrm{real}}$ or
simply by $V_{\mathrm{real}}$ when $\dagger$ is understood. If $v \in V_{\mathrm{real}}$ then $iv$ satisfies the equation
$(iv)^\dagger = -iv$ (and we might want to call such vectors "imaginary"). Every vector
$u \in V$ can be written in a unique way as

\[ u = v + iw, \quad v, w \in V_{\mathrm{real}}, \]

indeed

\[ v = \frac12(u + u^\dagger), \qquad w = -\frac{i}{2}(u - u^\dagger). \]

Familiar examples are: $V$ is the set of all $n \times n$ complex matrices and $\dagger$
is conjugate transpose. The real vectors are then the self adjoint matrices.
Another example is to start with a real vector space, $E$, and then complexify it
by tensoring with the complex numbers:

\[ V = E \otimes_{\mathbb{R}} \mathbb{C} \]

with

\[ (x \otimes_{\mathbb{R}} c)^\dagger = x \otimes_{\mathbb{R}} \bar c. \]

The corresponding real subspace is then identified with our starting space, $E$.
The above remarks about every vector being written as $u = v + iw$ show that
any complex vector space with conjugation can be identified with this example,
i.e. as $V = E \otimes_{\mathbb{R}} \mathbb{C}$ where $E = V_{\mathrm{real}}$.
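The matrix example can be made concrete. The sketch below, with hypothetical helper names and an arbitrary sample matrix, takes conjugate transpose as the complex conjugation and splits a matrix $u$ into $v + iw$ with $v = \frac12(u + u^\dagger)$ and $w = -\frac{i}{2}(u - u^\dagger)$, then checks that both parts are self adjoint and that they recombine to $u$.

```python
def dagger(m):
    """Conjugate transpose of a square complex matrix given as nested lists."""
    n = len(m)
    return [[complex(m[j][i]).conjugate() for j in range(n)] for i in range(n)]


def combine(a, m1, b, m2):
    """Return a*m1 + b*m2 entrywise."""
    n = len(m1)
    return [[a * m1[i][j] + b * m2[i][j] for j in range(n)] for i in range(n)]


def close(m1, m2, tol=1e-12):
    """Entrywise comparison up to a small tolerance."""
    n = len(m1)
    return all(abs(m1[i][j] - m2[i][j]) < tol for i in range(n) for j in range(n))


def real_and_imaginary_parts(u):
    """Split u = v + i w with v and w fixed by the conjugation (self adjoint)."""
    v = combine(0.5, u, 0.5, dagger(u))
    w = combine(-0.5j, u, 0.5j, dagger(u))
    return v, w


u = [[1 + 2j, 3 - 1j], [0.5j, -4 + 0j]]  # arbitrary sample matrix
v, w = real_and_imaginary_parts(u)
print(close(v, dagger(v)), close(w, dagger(w)), close(combine(1, v, 1j, w), u))
```

The same two formulas work for any complex conjugation; only the `dagger` function would change.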

We shall be interested in two other types of examples. Suppose we start
with a vector space $U$ and construct $V = U \otimes \overline{U}$. Define complex conjugation
by

\[ (u \otimes \bar v)^\dagger := v \otimes \bar u. \]

So

\[ \dagger = s \circ ({}^{-} \otimes {}^{-}) \]

where

\[ s : \overline{U} \otimes U \to U \otimes \overline{U} \]

switches the order of the factors. The real subspace is spanned by the elements
of the form $u \otimes \bar u$.

A second example is $V = U \oplus \overline{U}$ with

\[ (x + \bar y)^\dagger = y + \bar x. \]

The real subspace consists of all $x + \bar x$ and hence can be identified with $U$ as
a real vector space. That is, we can consider $U$ as a vector space over the real
numbers (forgetting about multiplication by $i$), and this can be identified as a
real vector space with the real subspace of $U \oplus \overline{U}$. For example, suppose that
$g$ is a symmetric (complex) bilinear form on $U$. We then obtain a complex
symmetric bilinear form, $\bar g$, on $\overline{U}$ and hence a complex symmetric bilinear form,
$g \oplus \bar g$, on $U \oplus \overline{U}$ by declaring $U$ and $\overline{U}$ to be orthogonal:

\[ (g \oplus \bar g)(x + \bar u, y + \bar v) := g(x, y) + \bar g(\bar u, \bar v). \]

This restricts to a real bilinear form on the real subspace:

\[ (g \oplus \bar g)(x + \bar x, y + \bar y) = 2\operatorname{Re} g(x, y). \]

So under the identification of the real subspace of $U \oplus \overline{U}$ with $U$, the metric
$g \oplus \bar g$ becomes identified with the real quadratic form $2\operatorname{Re} g$. Suppose that $g$
is non-degenerate, and we choose a (complex) orthonormal basis, $e_1, \dots, e_n$,
for $g$. So $g(e_i, e_i) = 1$ and $g(e_i, e_j) = 0$ for $i \ne j$. This is always
possible for non-degenerate symmetric forms on complex vector spaces. Then
$e_1, \dots, e_n, ie_1, \dots, ie_n$ is an orthogonal basis for $U$ as a real vector space with
scalar product $\operatorname{Re} g$ and $\operatorname{Re} g(ie_k, ie_k) = -1$. So the metric $2\operatorname{Re} g$ is of type
$(n, n)$ on the space $U$ thought of as a $2n$ dimensional real vector space.

10.4 Structures on tensor products.

If $U$ and $V$ are (complex) vector spaces, then

\[ \Lambda^2(U \otimes V) = S^2(U) \otimes \Lambda^2(V) \oplus \Lambda^2(U) \otimes S^2(V). \]

Exterior multiplication is given by

\[ (u_1 \otimes v_1) \wedge (u_2 \otimes v_2) = u_1u_2 \otimes v_1 \wedge v_2 + u_1 \wedge u_2 \otimes v_1v_2 \]

where $u_1u_2$ denotes the product of $u_1$ and $u_2$ in the symmetric algebra, and
similarly for $v_1v_2$. We will want to apply this construction to the case $V = \overline{U}$.

If $U$ has an antisymmetric bilinear form, $\omega$, and $V$ has an antisymmetric
form, $\sigma$, then this induces a symmetric bilinear form on $U \otimes V$ by

\[ \langle u_1 \otimes v_1, u_2 \otimes v_2\rangle = \omega(u_1, u_2)\,\sigma(v_1, v_2). \]

We will want to apply this construction to $V = \overline{U}$ and $\sigma = \bar\omega$.

The symmetric bilinear form induced on $U \otimes V$ in turn induces a scalar product
on $\Lambda^2(U \otimes V) = S^2(U) \otimes \Lambda^2(V) \oplus \Lambda^2(U) \otimes S^2(V)$ according to the usual rule

\[
\begin{aligned}
&\langle (u_1 \otimes v_1) \wedge (u_2 \otimes v_2), (u_3 \otimes v_3) \wedge (u_4 \otimes v_4)\rangle\\
&\quad = \langle u_1 \otimes v_1, u_3 \otimes v_3\rangle\langle u_2 \otimes v_2, u_4 \otimes v_4\rangle - \langle u_1 \otimes v_1, u_4 \otimes v_4\rangle\langle u_2 \otimes v_2, u_3 \otimes v_3\rangle\\
&\quad = \omega(u_1, u_3)\omega(u_2, u_4)\sigma(v_1, v_3)\sigma(v_2, v_4) - \omega(u_1, u_4)\omega(u_2, u_3)\sigma(v_1, v_4)\sigma(v_2, v_3).
\end{aligned}
\]

We can interpret this scalar product as follows: put scalar products on the spaces
$S^2(U)$ and $\Lambda^2(U)$ according to the rules

\[ \langle u_1u_2, u_3u_4\rangle := \tfrac12\left(\omega(u_1, u_3)\omega(u_2, u_4) + \omega(u_1, u_4)\omega(u_2, u_3)\right) \]

and

\[ \langle u_1 \wedge u_2, u_3 \wedge u_4\rangle := \omega(u_1, u_3)\omega(u_2, u_4) - \omega(u_1, u_4)\omega(u_2, u_3). \]

Make similar definitions for $S^2(V)$, $\Lambda^2(V)$. Put the tensor product scalar
product on $S^2(U) \otimes \Lambda^2(V)$ and $\Lambda^2(U) \otimes S^2(V)$. Declare the spaces $S^2(U) \otimes \Lambda^2(V)$
and $\Lambda^2(U) \otimes S^2(V)$ to be orthogonal in the direct sum,

\[ \Lambda^2(U \otimes V) = S^2(U) \otimes \Lambda^2(V) \oplus \Lambda^2(U) \otimes S^2(V). \]

This direct sum scalar product then coincides with the scalar product described
above. In particular, when $V = \overline{U}$ and $\sigma = \bar\omega$, and when we think of conjugation
as mapping $S^2(U) \otimes \Lambda^2(\overline{U}) \to \Lambda^2(U) \otimes S^2(\overline{U})$, we are in the situation described
above, of $g \oplus \bar g$, where $g$ is the tensor product metric on $S^2(U) \otimes \Lambda^2(\overline{U})$.

10.5 Spinors and Minkowski space.

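Let $U$ be a two dimensional complex vector space with an antisymmetric
non-degenerate bilinear form, $\omega$. Then we get a symmetric bilinear form on $U \otimes \overline{U}$.
Let us check that the restriction of this symmetric form to the real subspace is
real, and is of type (1,3). To see this, let $u$ be any non-zero element of $U$, and
let $v$ be some other vector with

\[ \omega(u, v) = \frac{1}{\sqrt2}. \]

Then $u \otimes \bar w$ is a null vector of $U \otimes \overline{U}$ for any $w$, since $\omega(u, u) = 0$. Then

\[ \langle u \otimes \bar u + v \otimes \bar v, u \otimes \bar u + v \otimes \bar v\rangle = 2\omega(u, v)^2 = 1. \]

Also

\[ \langle u \otimes \bar u - v \otimes \bar v, u \otimes \bar u - v \otimes \bar v\rangle = -1 \]
\[ \langle u \otimes \bar v + v \otimes \bar u, u \otimes \bar v + v \otimes \bar u\rangle = -1 \]
\[ \langle i(u \otimes \bar v - v \otimes \bar u), i(u \otimes \bar v - v \otimes \bar u)\rangle = -1 \]

and the vectors $u \otimes \bar u + v \otimes \bar v$, $u \otimes \bar u - v \otimes \bar v$, $u \otimes \bar v + v \otimes \bar u$, $i(u \otimes \bar v - v \otimes \bar u)$ are
mutually orthogonal, and span the real subspace.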
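The type (1,3) claim can be verified by computing the Gram matrix of the four vectors. The sketch below makes some assumptions for illustration: it takes $u = e_0$, $v = e_1$ in $U = \mathbb{C}^2$, represents an element of $U \otimes \overline{U}$ by the $2 \times 2$ matrix of its coefficients on $e_a \otimes \bar e_b$, and takes $\omega$ to be $1/\sqrt2$ times the standard antisymmetric matrix. The names are hypothetical.

```python
EPSILON = [[0, 1], [-1, 0]]  # omega = EPSILON / sqrt(2), so omega(u, v) = 1/sqrt(2)


def form(m, n):
    """<m, n> = sum of omega[a][c] * conj(omega[b][d]) * m[a][b] * n[c][d].

    The two factors of 1/sqrt(2) combine into the exact prefactor 1/2.
    """
    return 0.5 * sum(EPSILON[a][c] * EPSILON[b][d] * m[a][b] * n[c][d]
                     for a in range(2) for b in range(2)
                     for c in range(2) for d in range(2))


# m[a][b] is the coefficient of e_a (x) conj(e_b), with u = e_0 and v = e_1.
vectors = [
    [[1, 0], [0, 1]],      # u(x)u~ + v(x)v~
    [[1, 0], [0, -1]],     # u(x)u~ - v(x)v~
    [[0, 1], [1, 0]],      # u(x)v~ + v(x)u~
    [[0, 1j], [-1j, 0]],   # i(u(x)v~ - v(x)u~)
]

gram = [[form(m, n) for n in vectors] for m in vectors]
for row in gram:
    print(row)
```

In this representation the real subspace consists of the Hermitian matrices and the quadratic form is the determinant, which is one way to see the single plus sign and three minus signs.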

Let $a := u \wedge v$. So $a$ can be characterized as the unique element of $\Lambda^2 U$
satisfying $\omega(a) = \frac{1}{\sqrt2}$. Then

\[ u \otimes \bar u \wedge (u \otimes \bar v + v \otimes \bar u) = u^2 \otimes \bar a + a \otimes \bar u^2. \]

This element of $\Lambda^2 T$, where $T$ is the real subspace of $U \otimes \overline{U}$, is the wedge product
of a null vector, $u \otimes \bar u$, and a spacelike vector orthogonal to the null vector. Hence
it corresponds to a "null plane" containing the null vector $u \otimes \bar u$.

Thus each non-zero $u \in U$ determines a null vector, $u \otimes \bar u$, and a "null
plane", $Q_u$, corresponding to the decomposable element $u^2 \otimes \bar a + a \otimes \bar u^2$.
Multiplying $u$ by a phase factor, $e^{i\theta}$, multiplies $\bar u$ by $e^{-i\theta}$ and hence does not change
the null vector $u \otimes \bar u$. But it changes the null plane since $u^2 \mapsto e^{2i\theta}u^2$.
Geometrically, this amounts to replacing $v$ by $e^{-i\theta}v$ and so rotates the vector
$u \otimes \bar v + v \otimes \bar u$ by $2\theta$. So $Q_{e^{i\theta}u}$ is obtained from $Q_u$ by rotation through angle $2\theta$.

We can compute the star operator in terms of the orthonormal basis
constructed above from $u$ and $v$, and find by direct computation that $\star(u^2 \otimes \bar a) =
\pm i u^2 \otimes \bar a$ (the same choice of sign for all $u$). Since the sign of the star
operator is determined by the orientation, we can choose the orientation so that
$\star(u^2 \otimes \bar a) = i u^2 \otimes \bar a$ for all $u \in U$, and hence the decomposition

\[ \Lambda^2(U \otimes \overline{U}) = S^2(U) \otimes \Lambda^2(\overline{U}) \oplus \Lambda^2(U) \otimes S^2(\overline{U}) \]

is the decomposition into the $+i$ and $-i$ eigenspaces of (the complexification of)
star on $\Lambda^2(U \otimes \overline{U}) = \Lambda^2 T \otimes_{\mathbb{R}} \mathbb{C}$.

10.6 Traceless curvatures.

If we use $a$ and $\bar a$ to identify $\Lambda^2(U)$ and $\Lambda^2(\overline{U})$ with $\mathbb{C}$, we then can write

\[ \Lambda^2(U \otimes \overline{U}) = S^2(U) \oplus S^2(\overline{U}), \]

as the decomposition into the $\pm i$ eigenspaces of the star operator. Then

\[ S^2(\Lambda^2(U \otimes \overline{U}))_- = S^2(S^2(U)) \oplus S^2(S^2(\overline{U})) \]

is the $-1$ eigenspace of the induced action of $\star$ on $S^2(\Lambda^2(U \otimes \overline{U}))$. The complex
conjugation is the obvious one coming from the complex conjugation $U \to \overline{U}$.
Thus we may identify the space of (real) $-1$ eigenvectors of $\star$ on $S^2(\Lambda^2(T))$ with
$S^2(S^2(U))$ considered as a real vector space.

The space $S^2(S^2(U))$ is six dimensional (over the complex numbers). It has
an invariant five dimensional subspace, $S^4(U)$, the space of quartic polynomials
in elements of $U$. We can also describe this subspace as follows: we can use the
quadratic form on $S^2(U)$ and on $S^2(S^2(U))$ to define a map

\[ \mu : S^2(S^2(U)) \to \operatorname{End} S^2(U), \]

\[ \langle \mu(t)s_1, s_2\rangle = \langle t, s_1 \cdot s_2\rangle, \quad t \in S^2(S^2(U)),\ s_1, s_2 \in S^2(U), \]

and where $s_1 \cdot s_2 \in S^2(S^2(U))$. This identifies $S^2(S^2(U))$ with the space of all
symmetric operators on $S^2(U)$, symmetric with respect to the quadratic form on
$S^2(U)$. The map $t \mapsto \operatorname{tr} \mu(t)$ is a linear form which is invariantly defined. Since
$Sl(U)$ acts irreducibly on $S^4(U)$, the restriction of this linear form to $S^4(U)$
must be zero, so we can think of $S^4(U)$ as consisting of traceless operators. Up
to an inessential scalar, we can consider the restriction of $\mu$ to $S^4(U)$, call it $\nu$,
characterized by

\[ \langle \nu(t)s_1, s_2\rangle = \langle t, s_1s_2\rangle, \]

where $s_1s_2 \in S^4(U)$ is the product of $s_1$ and $s_2$ in the symmetric algebra, and
the scalar product on the right is the scalar product in $S^4(U)$.

10.7 The polynomial algebra. 

It will be convenient to deal with the entire symmetric algebra, $S := S(U)$,
where $S^k$ denotes the homogeneous polynomials of degree $k$. For any $u \ne 0$ in $U$,
let us now choose $w$ such that $\omega(u, w) = 1$, and define the derivation on $S$

\[ \iota(u) : S^k \to S^{k-1} \]

by

\[ \iota(u)z = \omega(u, z) \quad \forall z \in U \]

which defines it on generators and hence determines it on all of $S$. The
commutator of any two derivations is a derivation, and the commutator $[\iota(u), \iota(u')]$
vanishes on $S^1$ and hence on $S$ for any pair of vectors $u$ and $u'$. Thus all
derivations $\iota(u)$ commute, and hence $u \mapsto \iota(u)$ extends to a homomorphism

\[ \iota : S \to \operatorname{End} S. \]

This allows us to extend $\omega$ to a bilinear form on $S$ by

\[ \langle s, t\rangle := [\iota(s)t]_0 \]

where the subscript 0 denotes the component in degree zero. So the spaces $S^k$
and $S^\ell$ are orthogonal with respect to this bilinear form when $k \ne \ell$, and the
restriction to $S^k \times S^k$ is symmetric when $k$ is even, and antisymmetric when $k$ is odd.
We can write the operator $\nu(t)$, $t \in S^4$, as

\[ \langle \nu(t)s_1, s_2\rangle = \iota(t)(s_1s_2), \quad s_1, s_2 \in S^2. \]

Since every quartic homogeneous polynomial in two variables is a product of four
linear polynomials, $t = u_1u_2u_3u_4$, we can use this formula and the derivation
property to describe the operator $\nu(t)$.

10.8 Petrov types. 

For example, suppose that $t = u^4$, i.e. all four factors are identical. Then
$\iota(u^4)u^kw^{4-k} = 0$ for $k \ne 0$ and $\iota(u)^4w^4 = 24$. Since $\langle u^2, w^2\rangle = 2$ in the
normalization used here, this gives

\[ \nu(u^4)u^2 = \nu(u^4)uw = 0, \quad \nu(u^4)w^2 = 12u^2. \]

Thus for any non zero $u \in U$, the operator $\nu(u^4)$ is a rank one nilpotent operator
with image $\mathbb{C}u^2$.

Suppose that three of the factors of $t$ are the same, and the fourth linearly
independent. So we may assume that $t = u^3w$ for $u, w \in U$ with $\omega(u, w) = 1$.
Then

\[ \iota(u^3w)u^kw^{4-k} = 0, \quad k \ne 1 \]

and

\[ \iota(u^3w)uw^3 = -6. \]

So

\[ \nu(u^3w)u^2 = 0, \quad \nu(u^3w)uw \in \mathbb{C}u^2, \quad \nu(u^3w)w^2 \in \mathbb{C}uw. \]

Thus $\nu(u^3w)$ has kernel $\mathbb{C}u^2$ and image the plane spanned by $u^2$ and $uw$ in $S^2$.
The image of this plane is the kernel, so $\nu(u^3w)$ is a two step nilpotent operator.

Next consider the case where $u_1 = u_2$, $u_3 = u_4$, $u_2 \ne u_3$, all not zero. The
non-zero value of $\omega(u_1, u_3)$ is an invariant. But we can always multiply our
element $t$ by a scalar factor, to arrange that this value is one. So up to scalar
multiple we have $t = u^2w^2$ for $0 \ne u \in U$, $\omega(u, w) = 1$. Then

\[ \iota(u^2w^2)u^kw^{4-k} = 0, \quad k \ne 2, \qquad \iota(u^2w^2)u^2w^2 = 4. \]

Our current choice of the normalization of the scalar product on $S^2$ yields

\[ \langle u^2, w^2\rangle = \langle w^2, u^2\rangle = 2, \quad \langle uw, uw\rangle = -1 \]

all other scalar products equal zero for the basis $u^2, uw, w^2$ of $S^2$. Hence it follows
that $\nu(u^2w^2)$ is diagonalizable with eigenvalues $-4, 2, 2$:

\[ \nu(u^2w^2)u^2 = 2u^2, \quad \nu(u^2w^2)w^2 = 2w^2, \quad \nu(u^2w^2)uw = -4uw. \]

Suppose that exactly two factors are equal. We can assume that the two
equal factors are $u$. Multiplying by scalars if necessary we can arrange that
$\omega(u, u_3) = \omega(u, u_4) = 1$. So $u_3 = w + au$, $u_4 = w + bu$, $a \ne b$. Replacing $w$ by
$w - \frac12(a + b)u$ we may write

\[ t = u^2(w + ru)(w - ru) = u^2w^2 - r^2u^4, \quad r \ne 0, \]

where we have $r = \frac12(a - b)$. Now the semisimple element $\nu(u^2w^2)$ commutes
with the rank one nilpotent element $\nu(u^4)$ since

\[
\nu(u^2w^2) = \begin{pmatrix} 2 & 0 & 0\\ 0 & -4 & 0\\ 0 & 0 & 2 \end{pmatrix},
\qquad
\nu(u^4) = \begin{pmatrix} 0 & 0 & 12\\ 0 & 0 & 0\\ 0 & 0 & 0 \end{pmatrix}
\]

in terms of the basis $u^2, uw, w^2$ of $S^2(U)$. In fact, the form of these two matrices
shows that the operator $\nu(u^2w^2 - r^2u^4)$ is not diagonalizable.

Finally, the generic case of four distinct linear factors corresponds to the 
generic case of three distinct eigenvalues. We thus have the various Petrov 
types for non-zero elements: 

name   # linear factors              structure of $\nu(t)$

I      4 distinct                    distinct eigenvalues, diagonalizable
II     3 distinct                    $2\lambda, -\lambda, -\lambda$, non-diagonalizable
D      $u_1 = u_2 \ne u_3 = u_4$       $2\lambda, -\lambda, -\lambda$, diagonalizable
III    $u_1 = u_2 = u_3 \ne u_4$       nilpotent, rank 2
N      $u_1 = u_2 = u_3 = u_4$         nilpotent, rank one
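The operators $\nu(t)$ for the normal forms can be computed mechanically from the derivation rule. The following sketch is an illustration under stated assumptions rather than the text's own method: polynomials in $u, w$ (with $\omega(u, w) = 1$) are stored as dictionaries, $\iota(u)$ acts as $\partial/\partial w$ and $\iota(w)$ as $-\partial/\partial u$, and the matrix of $\nu(t)$ in the basis $u^2, uw, w^2$ is obtained from $\langle \nu(t)s_1, s_2\rangle = \iota(t)(s_1s_2)$ by inverting the Gram matrix of the basis. All function names are hypothetical, and the type II example takes $r = 1$.

```python
from fractions import Fraction
from math import factorial


def falling(n, k):
    """n (n-1) ... (n-k+1): the coefficient from differentiating x^n k times."""
    return factorial(n) // factorial(n - k) if k <= n else 0


def multiply(p, q):
    """Product of polynomials stored as {(a, b): c}, meaning c * u^a * w^b."""
    out = {}
    for (a, b), c in p.items():
        for (i, j), d in q.items():
            out[(a + i, b + j)] = out.get((a + i, b + j), 0) + c * d
    return out


def iota(t, s):
    """Apply iota(t) to s, using iota(u) = d/dw and iota(w) = -d/du."""
    out = {}
    for (a, b), c in t.items():
        for (i, j), d in s.items():
            coeff = c * d * (-1) ** b * falling(j, a) * falling(i, b)
            if coeff:
                key = (i - b, j - a)
                out[key] = out.get(key, 0) + coeff
    return out


def pairing(s, t):
    """<s, t> = degree-zero component of iota(s) t."""
    return iota(s, t).get((0, 0), 0)


BASIS = [{(2, 0): 1}, {(1, 1): 1}, {(0, 2): 1}]  # u^2, uw, w^2
# Inverse of the Gram matrix [[0, 0, 2], [0, -1, 0], [2, 0, 0]] of BASIS.
GRAM_INVERSE = [[0, 0, Fraction(1, 2)], [0, -1, 0], [Fraction(1, 2), 0, 0]]


def nu(t):
    """Matrix of nu(t) in BASIS; column j holds the image of BASIS[j]."""
    b = [[pairing(t, multiply(si, sj)) for sj in BASIS] for si in BASIS]
    return [[sum(GRAM_INVERSE[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


gram = [[pairing(si, sj) for sj in BASIS] for si in BASIS]
print("Gram matrix:", gram)
for name, t in [("N", {(4, 0): 1}), ("III", {(3, 1): 1}), ("D", {(2, 2): 1}),
                ("II", {(2, 2): 1, (4, 0): -1})]:
    print(name, [[int(x) for x in row] for row in nu(t)])
```

Reading off the printed matrices, the nilpotent cases are strictly upper triangular in this basis, type D is diagonal, and type II is the diagonal matrix of type D plus a nilpotent corner entry, which is why it fails to be diagonalizable.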



10.9 Principal null directions. 

We have identified the space $S^2 = S^2(U)$ with the $+i$ eigenspace of $\star$ acting on
$\Lambda^2 T \otimes \mathbb{C}$. The map $\alpha \mapsto \alpha - i\star\alpha$ is a real linear identification of $\Lambda^2 T$ with this
eigenspace, under which multiplication by $i$ is pulled back to the star operator.
So an element $\alpha$ corresponds to a null vector in $S^2$ if and only if it satisfies
$\alpha \wedge \alpha = 0$ and $\langle\alpha, \alpha\rangle = 0$, and so determines a null plane which is degenerate
under the restriction of the Lorentz scalar product. Such a null plane contains
a unique null line. We can describe this null plane and null line in terms of
$S^2$ as follows. The null elements of $S^2$ are those elements which are squares of
linear elements. Indeed, every element of $S^2$ can be factored into the product
of two linear factors, say as $uv$, and if $v$ is not a multiple of $u$ then $\langle uv, uv\rangle \ne 0$.
So the null bivectors in $\Lambda^2 T$ correspond to elements of the form $u^2$, and the
corresponding null line in $T$ is the line spanned by $u \otimes \bar u$. If $u^2$ satisfies

\[ \langle \nu(t)u^2, u^2\rangle = 0 \]

then $\alpha$ is called a principal null bivector and its corresponding null line is called
a principal null line, and a non-zero vector in a principal null line is called a
principal null vector. If $\alpha = u^2$ then we say that $u$ is a principal spinor.

Projectively, the two quadric curves $\langle\alpha, \alpha\rangle = 0$ and $\langle\nu(t)\alpha, \alpha\rangle = 0$ will
intersect at four points, but these points may coalesce to give multiple points
of intersection.

The multiplicity, $m$, of a principal null vector $\ell = u \otimes \bar u$ is defined to be

• $m = 1$ if $u^2$ is not an eigenvector of $\nu(t)$,
• $m = 2$ if $u^2$ is an eigenvector of $\nu(t)$ with non-zero eigenvalue,
• $m = 3$ if $\nu(t)u^2 = 0$, $\dim \ker \nu(t) = 1$,
• $m = 4$ if $\nu(t)u^2 = 0$ and $\dim \ker \nu(t) = 2$.

The condition for $u$ to be a principal null spinor can be written as

\[ \iota(t)u^4 = 0. \]

If we write $t$ as a product of linear factors, $t = u_1u_2u_3u_4$, we see that this is
equivalent to saying that $u = u_i$ for some $i$ (up to a constant factor), i.e. that $u$ be a
factor of $t$. If we now go back to the previous section and examine each of
the normal forms we constructed for each type, we see that the factorization
properties defining the type of $t$ also give the multiplicities of the principal null
vectors. So type I has four distinct principal null vectors each of multiplicity 1,
type II has one principal null vector of multiplicity 2 and two of multiplicity 1,
type D has two principal null spinors each of multiplicity two, type III has one
of multiplicity 3 and one of multiplicity 1, and type N has one principal null
vector of multiplicity 4. In symbols:

I   $\Leftrightarrow$ (1,1,1,1)
II  $\Leftrightarrow$ (2,1,1)
D   $\Leftrightarrow$ (2,2)
III $\Leftrightarrow$ (3,1)
N   $\Leftrightarrow$ 4.
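The correspondence between multiplicities and types is easy to automate. In the sketch below, which uses a hypothetical function name and an assumed encoding, each linear factor $au + bw$ of $t$ is given as a pair of integers $(a, b)$, and two factors are treated as the same principal spinor when they are proportional.

```python
def petrov_type(factors):
    """Classify t = u1 u2 u3 u4 from its four linear factors.

    Each factor is a pair (a, b) standing for a*u + b*w; two factors count as
    equal when they are proportional, i.e. when a1*b2 - a2*b1 == 0.
    """
    if len(factors) != 4 or any(f == (0, 0) for f in factors):
        raise ValueError("expected four non-zero linear factors")
    classes = []  # each entry: [representative, multiplicity]
    for a, b in factors:
        for entry in classes:
            c, d = entry[0]
            if a * d - b * c == 0:
                entry[1] += 1
                break
        else:
            classes.append([(a, b), 1])
    multiplicities = tuple(sorted((m for _, m in classes), reverse=True))
    names = {(1, 1, 1, 1): "I", (2, 1, 1): "II", (2, 2): "D",
             (3, 1): "III", (4,): "N"}
    return names[multiplicities], multiplicities


print(petrov_type([(1, 0), (1, 0), (0, 1), (0, 1)]))   # t = u^2 w^2
print(petrov_type([(1, 0), (2, 0), (1, 1), (1, -1)]))  # t = 2 u^2 (u + w)(u - w)
```

Integer coordinates keep the proportionality test exact; with floating point factors a tolerance would be needed.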



Here is another description of the multiplicity of a null vector, $k = u \otimes \bar u$.
The element $u^2$ corresponds to a bivector $\alpha = k \wedge x$ where $x$ is some spacelike
vector perpendicular to $k$. To say that $k$ is principal is the same as to say that
$g(\nu(t)\alpha, \alpha) = 0$ where $g$ is the complex scalar product pulled back to $\Lambda^2 T$. The
real part of $g$ is just the original scalar product so $\langle\nu(t)\alpha, \alpha\rangle = 0$. Since we can
multiply $u$ and $\alpha$ by an arbitrary phase factor, the condition of being principal
is that

\[ \langle \nu(t)k \wedge x, k \wedge x\rangle = 0 \quad \forall x \perp k. \]

Writing

\[ \langle \nu(t)k \wedge x, k \wedge x\rangle = \langle R_{kx}k, x\rangle \]

and then polarizing, we see that this is the same as saying that

\[ \langle R_{kx}k, y\rangle = 0 \quad \forall x, y \perp k. \tag{10.3} \]

We claim that the null vector is principal with multiplicity $\ge 2$ if and only if

\[ \langle R_{kx}k, y\rangle = 0, \quad \forall x \perp k \text{ and } \forall y. \tag{10.4} \]

Proof. Suppose that $u$, with $k = u \otimes \bar u$, is a factor of order at least two in $t$. This
happens if and only if $\iota(t)u^3v = \iota(t)u^4 = 0$. This is the same as saying that
$\nu(t)u^2$ is orthogonal to the complex two dimensional space $(u^2)^\perp$ relative to the
complex metric. This complex two dimensional space corresponds to a real four
dimensional space, the orthogonal complement of the two dimensional subspace
of $\Lambda^2 T$ spanned by $k \wedge x = u^2$ and $k \wedge z = iu^2$. Here $x$ and $z$ are spacelike
vectors orthogonal to $k$ and to each other as above. So $u$ is a repeated factor of
$t$ if and only if

\[ \langle [R](k \wedge x), \gamma\rangle = 0 \]

for all $\gamma$ in this four dimensional subspace of $\Lambda^2 T$ and similarly for $z$. The
four dimensional space in question is spanned by the three dimensional space of
elements of the form $k \wedge y$, $y \in T$, and the element $x \wedge z$. In particular, applied to
elements of the form $k \wedge y$ we get condition (10.4) for the $x$ we have chosen and
also for $x$ replaced by $z$. It is automatic with $x$ replaced with $k$ since $R_{kk} = 0$.
This proves that (10.4) holds if $u$ is a repeated factor.

To prove the converse, we must show that $[R](k \wedge x)$ is orthogonal to the
four dimensional subspace of $\Lambda^2 T$ spanned by all $k \wedge y$ and $x \wedge z$. Condition
(10.4) guarantees the orthogonality for the elements of the form $k \wedge y$. So we
must prove that

\[ \langle R_{kx}x, z\rangle = 0. \]

This will follow from the Ricci flatness condition. Indeed, choose a null vector
$\ell$ orthogonal to $x$ and $z$ with $\langle k, \ell\rangle = -1$. Then

\[ 0 = \operatorname{Ric}[R](k, z) = \sum_{i,j} g^{ij}\langle R_{k y_i}z, y_j\rangle \]

as $y_i, y_j$ range over the elements $k, \ell, x, z$. This sum reduces to

\[ -\langle R_{k\ell}z, k\rangle + \langle R_{kx}z, x\rangle \]

all other terms vanishing. The first term vanishes by (10.4) and this implies the
vanishing of the second.

10.10 Kerr-Schild metrics.

We want to use (10.4) to conclude that if a Ricci flat metric is obtained from a
flat metric by adding the square of a null form, then the tracefree
curvature has a repeated factor. More precisely, if the null form is $\alpha$, and the
corresponding vector field is $N$, then we will show that this repeated factor is
$u$ where $u \otimes \bar u = N$. For this we recall the following facts:

• A necessary condition for the new metric to be Ricci flat is that

\[ \nabla_N N = \phi N \]

for some function $\phi$. Thus the integral curves of $N$ are null geodesics in
the old (and new) metric but with a possibly non-affine parametrization.

• The new affine connection differs from the old affine connection by
adding a tensor $A \in T \otimes S^2T^*$ which can be expressed in terms of the
null form. That is, the new connection is

\[ \nabla_X Y + A_X Y \]

where $\nabla$ is the old connection and we can write down a formula for $A_X Y$
involving the null form and its covariant derivatives. In particular,

\[ A_N = \phi N \otimes \alpha \tag{10.5} \]

i.e.

\[ A_N(X) = \phi\,\alpha(X)N. \]

Also

\[ \alpha(A_{\cdot}\,\cdot) = -\phi\,\alpha \otimes \alpha \]

i.e.

\[ \alpha(A_X Y) = -\phi\,\alpha(X)\alpha(Y). \tag{10.6} \]

If the affine connection is modified by the addition of a tensor $A$, then the
new curvature differs from the old curvature by

\[ R'_{XY} = R_{XY} + [A_X, A_Y] + (\nabla A)(X, Y) - (\nabla A)(Y, X). \]

Here

\[ (\nabla A)(X, Y) \in \operatorname{Hom}(T, T) \]

is defined by

\[ (\nabla A)(X, Y)Z = \nabla_X(A_Y Z) - A_{\nabla_X Y}Z - A_Y\nabla_X Z \]

where $\nabla$ is the connection relative to the old metric.
In our case the old curvature is zero and we are interested in computing

\[ \langle R'_{NX}N, Y\rangle = -\alpha(R'_{NX}Y). \]

We have $A_N N = 0$ and $A_N A_X N - A_X A_N N = 0$ so the bracket term makes no
contribution. Since $N$ is a null vector field, we have $\nabla_X N \perp N$ and so

\[ A_{\nabla_X N}N = A_N\nabla_X N = 0 \]

for any $X$, and $A_N N = 0$ so $\nabla_X(A_N N) = 0$ for any $X$. So we are left with the
formula

\[ \alpha(R'_{NX}Y) = \alpha((\nabla_N A)_X Y). \]

Now

\[ \alpha(\nabla_N A) = \nabla_N(\alpha(A)) - (\nabla_N\alpha)(A) = -(\nabla_N\phi + \phi^2)\,\alpha \otimes \alpha \]

or

\[ \alpha(R'_{NX}Y) = -(\nabla_N\phi + \phi^2)\,\alpha(X)\alpha(Y). \tag{10.7} \]

In particular, if $X \perp N$ so $\alpha(X) = 0$, the preceding expression vanishes for any
$Y$, proving that $N$ is a principal null vector of multiplicity at least two. QED

Chapter 11

Star. 



11.1 Definition of the star operator. 

We start with a finite dimensional vector space $V$ over the real numbers which
carries two additional pieces of structure: an orientation and a non-degenerate
scalar product. The scalar product, $\langle\ ,\ \rangle$, determines a scalar product on each
of the spaces $\Lambda^k V$ which is fixed by the requirement that it take on the values

\[ \langle x_1 \wedge \cdots \wedge x_k, y_1 \wedge \cdots \wedge y_k\rangle = \det\left(\langle x_i, y_j\rangle\right) \]

on decomposable elements. This scalar product is non-degenerate. Indeed,
starting from an "orthonormal" basis $e_1, \dots, e_n$ of $V$, the basis $e_{i_1} \wedge \cdots \wedge e_{i_k}$,
$i_1 < \cdots < i_k$, is an "orthonormal" basis of $\Lambda^k V$ where

\[ \langle e_{i_1} \wedge \cdots \wedge e_{i_k}, e_{i_1} \wedge \cdots \wedge e_{i_k}\rangle = (-1)^r \]

where $r$ is the number of the $i_j$ with $\langle e_{i_j}, e_{i_j}\rangle = -1$.

In particular, there are exactly two elements in the one dimensional space 
A"V, n = dim V which satisfy 

(v,v) = ±1. 

Here the ±1 is determined by the signature (p,q) ( p pluses and q minuses) of 
the scalar product: 

(v,v) = (-iy. 

An orientation of a vector space amounts to choosing one of the two half 
lines (rays) of non-zero elements in l\ a V . Hence for an oriented vector space 
with non-degenerate scalar product there is a well defined unique basis element 

v € /\ n V (v,v) = (-l) q . 

Wedge product always gives a bilinear map from $\wedge^k V \times \wedge^{n-k} V \to \wedge^n V$. But
now we have a distinguished basis element for the one dimensional space $\wedge^n V$.
The wedge product allows us to assign to each element $\lambda \in \wedge^k V$ the linear
function $\ell_\lambda$ on $\wedge^{n-k} V$ given by

$$\lambda \wedge \omega = \ell_\lambda(\omega)\, v \qquad \forall\, \omega \in \wedge^{n-k} V.$$

But since the induced scalar product on $\wedge^{n-k} V$ is non-degenerate, any linear
function $\ell$ is given as $\ell(\omega) = \langle \tau, \omega\rangle$ for a unique $\tau = \tau(\ell)$. So there is a unique
element

$$\star\lambda \in \wedge^{n-k} V$$

determined by

$$\lambda \wedge \omega = \langle \star\lambda, \omega\rangle\, v. \tag{11.1}$$

This is our convention with regard to the star operator. In short, we have
defined a linear map

$$\star : \wedge^k V \to \wedge^{n-k} V$$

for each $0 \le k \le n$ which is determined by (11.1).

Let us choose an orthonormal basis of $V$ as above, but being sure to choose
our orthonormal basis to be oriented, which means that

$$v = e_1 \wedge \cdots \wedge e_n.$$

Let $I = (i_1, \dots, i_k)$ be a $k$-subset of $\{1, \dots, n\}$ with its elements arranged in
order, $i_1 < \cdots < i_k$, so that the

$$e_I := e_{i_1} \wedge \cdots \wedge e_{i_k}$$

form an "orthonormal" basis of $\wedge^k V$. Let $I^c$ denote the complementary set of
$I \subset \{1, \dots, n\}$ with its elements arranged in increasing order. Thus $e_{I^c}$ is one
of the basis elements $\{e_J\}$ where $J$ ranges over all $(n-k)$-subsets of $\{1, \dots, n\}$.
We have

$$e_I \wedge e_J = 0 \quad \text{if } J \neq I^c$$

while

$$e_I \wedge e_{I^c} = (-1)^\pi\, v$$

where $(-1)^\pi$ is the sign of the permutation required to bring the entries in
$e_I \wedge e_{I^c}$ back to increasing order. Thus

$$\star e_I = (-1)^{\pi + r(I^c)}\, e_{I^c} \tag{11.2}$$

where $(-1)^{\pi + r(I^c)} := (-1)^\pi (-1)^{r(I^c)}$ and

$r(J)$ is the number of $j \in J$ with $\langle e_j, e_j\rangle = -1$, so that

$$(-1)^{r(J)} = \langle e_J, e_J\rangle. \tag{11.3}$$

We should explicate the general definition of the star operator for the
extreme cases $k = 0$ and $k = n$. We have $\wedge^0 V = \mathbf{R}$ for any vector space $V$,
and the scalar product on $\mathbf{R}$ is the standard one assigning to each real number
its square. Taking the number 1 as a basis for $\mathbf{R}$ thought of as a one dimensional
vector space over itself, this means that $\langle 1, 1\rangle = 1$. Wedge product by
an element of $\wedge^0 V = \mathbf{R}$ is just ordinary multiplication of a vector by a real
number. So,

$$v \wedge 1 = 1 \wedge v = v$$

and the definition

$$v \wedge 1 = \langle \star v, 1\rangle\, v$$

requires that

$$\star v = 1 \tag{11.4}$$

no matter what the signature of the scalar product on $V$ is. On the other hand,
$\star 1 = \pm v$. We determine the sign from

$$1 \wedge v = v = \langle \star 1, v\rangle\, v$$

so

$$\star 1 = \langle v, v\rangle\, v = (-1)^q\, v \tag{11.5}$$

in accordance with our general rule.

Applying $\star$ twice gives a linear map of $\wedge^k V$ into itself for each $k$. We claim
that

$$\star^2 = (-1)^{k(n-k)+q}\, \mathrm{id}. \tag{11.6}$$

Indeed, since both sides are linear operators it suffices to verify this equation on
basis elements, e.g. on elements of the form $e_I$, and by relabeling if necessary
we may assume, without loss of generality, that $I = \{1, \dots, k\}$. Then

$$\star(e_1 \wedge \cdots \wedge e_k) = (-1)^{r(I^c)}\, e_{k+1} \wedge \cdots \wedge e_n,$$

while

$$\star(e_{k+1} \wedge \cdots \wedge e_n) = (-1)^{k(n-k) + r(I)}\, e_1 \wedge \cdots \wedge e_k$$

since there are $n - k$ transpositions needed to bring each of the $e_i$, $i \le k$, past
$e_{k+1} \wedge \cdots \wedge e_n$. Since $r(I) + r(I^c) = q$, (11.6) follows.
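
The rule (11.2) and the sign in (11.6) are purely combinatorial, so they can be checked mechanically. The following is a minimal sketch, not part of the original notes: the helper names `perm_sign`, `star_basis` and `check_star_squared` are hypothetical, and indices run from 0 rather than from 1 as in the text.

```python
from itertools import combinations


def perm_sign(seq):
    """Sign of the permutation that sorts a sequence of distinct integers."""
    inversions = sum(
        1
        for a in range(len(seq))
        for b in range(a + 1, len(seq))
        if seq[a] > seq[b]
    )
    return -1 if inversions % 2 else 1


def star_basis(I, signs):
    """Return (coeff, Ic) with star(e_I) = coeff * e_Ic, following (11.2).

    I     : increasing tuple of 0-based indices (an assumption of this sketch).
    signs : signs[i] = <e_i, e_i> = +1 or -1 for an oriented orthonormal basis.
    """
    n = len(signs)
    I = tuple(I)
    Ic = tuple(i for i in range(n) if i not in I)
    sign_pi = perm_sign(I + Ic)   # (-1)^pi: reorders e_I ^ e_Ic into e_1 ^ ... ^ e_n
    sign_r = 1
    for j in Ic:                  # (-1)^{r(Ic)} = <e_Ic, e_Ic>, see (11.3)
        sign_r *= signs[j]
    return sign_pi * sign_r, Ic


def check_star_squared(signs):
    """True if star^2 = (-1)^(k(n-k)+q) on every basis element e_I, as in (11.6)."""
    n = len(signs)
    q = sum(1 for s in signs if s == -1)
    for k in range(n + 1):
        for I in combinations(range(n), k):
            c1, Ic = star_basis(I, signs)
            c2, back = star_basis(Ic, signs)
            if back != I or c1 * c2 != (-1) ** (k * (n - k) + q):
                return False
    return True
```

For instance, `check_star_squared((1, -1, -1, -1))` runs through all sixteen basis elements of the exterior algebra of a four dimensional space of signature (1,3).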

11.2 Does $\star : \wedge^k V \to \wedge^{n-k} V$ determine the metric?

The star operator depends on the metric and on the orientation. Clearly, chang- 
ing the orientation changes the sign of the star operator. 

Let us discuss the question of when the star operator determines the scalar
product. We claim, as a preliminary, that it follows from the definition that

$$\lambda \wedge \star\omega = (-1)^q \langle \lambda, \omega\rangle\, v \qquad \forall\, \lambda, \omega \in \wedge^k V \tag{11.7}$$

for any $0 \le k \le n$. Indeed, we have really already verified this formula for the
case $k = 0$ or $k = n$. For any intermediate $k$, we observe that both sides are
bilinear in $\lambda$ and $\omega$, so it suffices to verify this equation on basis elements, i.e.
when $\lambda = e_I$ and $\omega = e_K$ where $I$ and $K$ are $k$-subsets of $\{1, \dots, n\}$. If $K \neq I$
then $\langle e_I, e_K\rangle = 0$, while $K^c$ and $I$ have at least one element in common, so
$e_I \wedge \star e_K = 0$. Hence both sides equal zero. So we must only check the equation
for $I = K$, and without loss of generality we may assume (by relabeling the
indices) that $I = \{1, 2, \dots, k\}$. Then, by (11.2), the left hand side of (11.7) is

$$e_I \wedge \star e_I = (-1)^{r(I^c)}\, e_1 \wedge \cdots \wedge e_n = (-1)^{r(I^c)}\, v$$

while the right hand side is $(-1)^{q + r(I)}\, v$ by (11.3). Since $q = r(I) + r(I^c)$ the
result follows.

One might think that (11.7) implies that $\star$ acting on $\wedge^k V$, $k \neq 0, n$, determines
the scalar product, but this is not quite true. Here is the simplest (and
very important) counterexample. Take $V = \mathbf{R}^2$ with the standard positive definite
scalar product and $k = 1$. So $\star : \wedge^1 V = V \to V$. In terms of an oriented
orthonormal basis we have $\star e_1 = e_2$, $\star e_2 = -e_1$; thus $\star$ is (counterclockwise)
rotation through ninety degrees. Any (non-zero) multiple of the standard scalar
product will determine the same notion of angle, and hence the same $\star$ operator.
Thus, in two dimensions, the $\star$ operator only determines the metric up to scale.

The reason for the breakdown in the argument is that the $v$ occurring on the
right hand side of (11.7) depends on the choice of metric. It is clear from (11.7)
that the star operator acting on $\wedge^k V$ determines the induced scalar product on
$\wedge^k V$ up to scale. Indeed, let $\langle\ ,\ \rangle'$ denote a second scalar product on $V$. Let $v'$
denote the element of $\wedge^n V$ determined by the scalar product $\langle\ ,\ \rangle'$, so

$$v' = a v$$

for some non-zero constant, $a > 0$. Finally, for purposes of the present argument,
let us use more precise notation and denote the scalar products induced on $\wedge^k V$
by $\langle\ ,\ \rangle_k$ and $\langle\ ,\ \rangle'_k$. Then (11.7) implies that

$$\langle\ ,\ \rangle'_k = \frac{1}{a}\,\langle\ ,\ \rangle_k. \tag{11.8}$$

For example, suppose that we know that the original scalar products on $V$ differ
by a positive scalar factor, say

$$\langle\ ,\ \rangle' = c\,\langle\ ,\ \rangle, \qquad c > 0.$$

Then

$$\langle\ ,\ \rangle'_k = c^k\,\langle\ ,\ \rangle_k$$

while

$$a = c^{-n/2}$$

since $\langle v, v\rangle'_n = c^n \langle v, v\rangle_n$. Hence the fact that the star operators are the same on
$\wedge^k V$ implies that $c = 1$ for any $k$ other than $k = n/2$. This was exactly the point
of breakdown in our two dimensional example where $n = 2$, $k = 1$.

In general, if $\langle\ ,\ \rangle$ is positive definite, and $\langle\ ,\ \rangle'$ is any other non-degenerate
scalar product, then the principal axis theorem (the diagonalization theorem for
symmetric matrices) from linear algebra says that we can find a basis $e_1, \dots, e_n$
which is orthonormal for $\langle\ ,\ \rangle$ and orthogonal with respect to $\langle\ ,\ \rangle'$ with

$$\langle e_i, e_i\rangle' = s_i, \qquad s_i \neq 0.$$

Then

$$\langle e_I, e_I\rangle' = s_{i_1} \cdots s_{i_k}\, \langle e_I, e_I\rangle, \qquad I = \{i_1, \dots, i_k\}.$$

The only way that (11.8) can hold for a given $0 < k < n$ is for all the $s_i$ to be
equal. Let $s$ denote this common value of the $s_i$. Then $a = |s|^{-n/2}$ and we can
conclude that $s = \pm 1$ if $k \neq n/2$ and in fact that $s = 1$ if, in addition, $k$ is odd.
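
The scaling argument above can be illustrated numerically. The sketch below assumes an orthogonal (not necessarily normalized) basis with $\langle e_i, e_i\rangle = g_i$, for which (11.1) gives $\star e_I = \pm\sqrt{|g_1 \cdots g_n|}\,\big/\prod_{j \in I^c} g_j \; e_{I^c}$. The function names `star_coefficient` and `scaled_star_ratio` are hypothetical. Replacing the Euclidean metric by $c$ times itself should then multiply $\star$ on $\wedge^k V$ by $c^{k - n/2}$, which equals 1 exactly when $k = n/2$.

```python
import math


def star_coefficient(I, g):
    """Coefficient c in star(e_I) = c * e_Ic for an orthogonal basis.

    g[i] = <e_i, e_i> may be any non-zero real number, so the basis is
    orthogonal but not necessarily normalized (hypothetical helper).
    """
    n = len(g)
    I = tuple(I)
    Ic = tuple(i for i in range(n) if i not in I)
    seq = I + Ic
    inversions = sum(
        1 for a in range(n) for b in range(a + 1, n) if seq[a] > seq[b]
    )
    sign_pi = -1 if inversions % 2 else 1
    # e_1 ^ ... ^ e_n = sqrt(|g_1 ... g_n|) * v, with v the unit volume element.
    volume = math.sqrt(abs(math.prod(g)))
    # (11.1) with omega = e_Ic reads: sign_pi * volume = c * <e_Ic, e_Ic>.
    return sign_pi * volume / math.prod(g[j] for j in Ic)


def scaled_star_ratio(n, k, c):
    """Star for the metric c*<,> divided by star for <,>, on e_1 ^ ... ^ e_k."""
    I = tuple(range(k))
    return star_coefficient(I, (c,) * n) / star_coefficient(I, (1.0,) * n)
```

With $n = 2$, $k = 1$ the ratio is 1 for every $c > 0$, which is the two dimensional counterexample; with $n = 3$, $k = 1$ it is $c^{-1/2}$, so the star operator does detect the scale.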

I don't know how to deal with the case of a general (non-definite) scalar product
in so straightforward a manner. Perhaps you can work this out. But let me deal
with the case of importance to us, a Lorentzian metric on a four dimensional
space, so a metric of signature (1,3) or (3,1). For $k = 1$, we know from the above
discussion that the star operator determines the metric completely. The case
$k = 3$ reduces to the case $k = 1$ since $\star^2 = (-1)^{3\cdot 1 + 1}\,\mathrm{id} = \mathrm{id}$ in this degree. The
only remaining case is $k = 2$, where we know that $\star$ only determines $\langle\ ,\ \rangle_2$ up to
a scalar. So the best we can hope for is that $\star : \wedge^2 V \to \wedge^2 V$ determines $\langle\ ,\ \rangle$ up
to a scalar multiple. The following proof (in the form of exercises) involves facts
that will be useful to us later on when we study curvature properties of black
holes, so we will need them anyway. What we are trying to prove is that (in our
situation of a Minkowski metric in four dimensions) the equality $\langle\ ,\ \rangle'_2 = b\,\langle\ ,\ \rangle_2$
for some $b \neq 0$ implies that $\langle\ ,\ \rangle' = s\,\langle\ ,\ \rangle$ for some $s \neq 0$:



1. Show that the metric $\langle\ ,\ \rangle_2$ induced on $\wedge^2 V$ from the Minkowski metric
$\langle\ ,\ \rangle$ on $V$ has signature (3,3). (It doesn't matter for this result whether we use
signature (1,3) or (3,1) for our Minkowski metric.)
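
As a sanity check on the statement of this exercise (not a substitute for the proof), one can tabulate the Gram matrix of $\langle\ ,\ \rangle_2$ on the basis $e_a \wedge e_b$, $a < b$, using the determinant formula of Section 11.1. This is a sketch under the assumption of a diagonal metric; `gram_on_bivectors` and `signature_on_bivectors` are hypothetical names.

```python
from itertools import combinations


def gram_on_bivectors(signs):
    """Gram matrix of <,>_2 on the basis e_a ^ e_b (a < b).

    Uses <x1 ^ x2, y1 ^ y2> = det(<x_i, y_j>) for a diagonal metric with
    <e_i, e_i> = signs[i].
    """
    n = len(signs)

    def g(i, j):
        return signs[i] if i == j else 0

    pairs = list(combinations(range(n), 2))
    return [
        [g(a, c) * g(b, d) - g(a, d) * g(b, c) for (c, d) in pairs]
        for (a, b) in pairs
    ]


def signature_on_bivectors(signs):
    """(number of +1 entries, number of -1 entries) on the diagonal."""
    gram = gram_on_bivectors(signs)
    size = len(gram)
    # The basis e_a ^ e_b is orthogonal, so the Gram matrix must be diagonal.
    assert all(gram[r][s] == 0 for r in range(size) for s in range(size) if r != s)
    plus = sum(1 for r in range(size) if gram[r][r] > 0)
    minus = sum(1 for r in range(size) if gram[r][r] < 0)
    return plus, minus
```

Calling `signature_on_bivectors((1, -1, -1, -1))` and `signature_on_bivectors((-1, 1, 1, 1))` lets one compare the two sign conventions directly.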



2. Let $u, v \in V$. Show that $\langle u \wedge v, u \wedge v\rangle_2 = 0$ if and only if the plane $P_{\{u,v\}}$
spanned by $u$ and $v$ is degenerate, i.e. the restriction of $\langle\ ,\ \rangle$ to $P_{\{u,v\}}$ is singular.
This means that $\{0\} \neq P_{\{u,v\}}^\perp \cap P_{\{u,v\}}$. Now $P_{\{u,v\}}^\perp \neq P_{\{u,v\}}$ since there are no
totally null planes in $V$. So $P_{\{u,v\}}^\perp \cap P_{\{u,v\}}$ is a line consisting of null vectors,
that is of vectors $n$ satisfying $\langle n, n\rangle = 0$. Show that all other vectors in $P_{\{u,v\}}$
are spacelike. That is, if $w \in P_{\{u,v\}}$, $w \notin P_{\{u,v\}}^\perp$, then $\langle w, w\rangle < 0$ if we use the
signature (1,3) or $\langle w, w\rangle > 0$ if we use the signature (3,1). Conversely, if $n$ is
any non-zero null vector and $w$ is any spacelike vector perpendicular to $n$ then
the plane spanned by $n$ and $w$ is a degenerate plane so that $\langle u \wedge v, u \wedge v\rangle_2 = 0$
for any pair of vectors spanning this plane.

Notice that if $u', v'$ is some other pair of vectors spanning the plane $P_{\{u,v\}}$,
then $u' \wedge v' = b\, u \wedge v$ for some scalar $b \neq 0$. Conversely, if $u' \wedge v' = b\, u \wedge v$, $b \neq 0$,
then $u', v'$ span the same plane, $P_{\{u,v\}}$, as do $u$ and $v$. So every line of null
decomposable bivectors (i.e. a line of the form $\{r\, u \wedge v\}$ with $\langle u \wedge v, u \wedge v\rangle_2 = 0$)
determines a line of null vectors, $\{cn\}$. Conversely, if we start with the line
$\{cn\}$ of null vectors, let

$$Q_n := n^\perp$$

be the orthogonal complement of $n$. It is a three dimensional subspace of $V$
containing $n$; all elements of $Q_n$ not lying on the line $\{cn\}$ being spacelike. The
choice of any spacelike vector, $w$, in $Q_n$, say with $|\langle w, w\rangle| = 1$, then determines
a degenerate plane containing $n$ and lying in $Q_n$. We thus get a whole "circle"
of null planes $P$ with

$$\{0\} \subset \{cn\} \subset P \subset Q_n \subset V.$$

In general, a chain of increasing subspaces is called a "flag" in the mathematical
literature. If the dimensions increase by one at each step it is called a "complete
flag". What we have here is that each $u \wedge v$ with $\langle u \wedge v, u \wedge v\rangle_2 = 0$ determines
a special kind of complete flag, starting with a line of null vectors. (Penrose
uses the following picturesque language: he calls $\{cn\}$ the flagpole about which
the plane $P$ rotates.) All this is overkill for our present purpose, but will be
needed later on. What we do conclude for our current needs is that the cone
of null bivectors, $\{\omega \in \wedge^2 V \mid \langle \omega, \omega\rangle_2 = 0\}$, determines the cone of null vectors,
$N := \{w \in V \mid \langle w, w\rangle = 0\}$. So we can conclude the proof with the following:



3. Let $W$ be any vector space with a non-degenerate scalar product $\langle\ ,\ \rangle$ of
type $(p, q)$ with $p \neq 0$, $q \neq 0$ and let $N := \{w \in W \mid \langle w, w\rangle = 0\}$ be its null cone.
If $\langle\ ,\ \rangle'$ is any other (non-degenerate) scalar product with the same null cone
then $\langle\ ,\ \rangle' = s\,\langle\ ,\ \rangle$ for some non-zero scalar, $s$.



So we now know that in our four dimensional Minkowski space, a knowledge
of $\star : \wedge^2 V \to \wedge^2 V$ determines the metric up to scale. Here are some more
special facts we will need later.



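4. Show that $\star : \wedge^2 V \to \wedge^2 V$ is self adjoint relative to $\langle\ ,\ \rangle_2$, i.e.

$$\langle \star\lambda, \omega\rangle = \langle \lambda, \star\omega\rangle \qquad \forall\, \lambda, \omega \in \wedge^2 V.$$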



The next three problems relate to the discussion in Chapter IX. It follows
from our general formula that $\star : \wedge^2 V \to \wedge^2 V$ satisfies $\star^2 = -\mathrm{id}$. This means
that $\star : \wedge^2 V \to \wedge^2 V$ has eigenvalues $i$ and $-i$. In order to have actual
eigenvectors, we must complexify. So we introduce the space

$$\wedge^2 V_{\mathbf{C}} := \wedge^2 V \otimes \mathbf{C}.$$

An element of $\wedge^2 V_{\mathbf{C}}$ is an expression of the form $\lambda + i\omega$, $\lambda, \omega \in \wedge^2 V$. Any linear
operator on $\wedge^2 V$ automatically extends to become a complex linear operator on
$\wedge^2 V_{\mathbf{C}}$. For example $\star(\lambda + i\omega) := \star\lambda + i\star\omega$. Similarly, every real bilinear form
on $\wedge^2 V$ extends to a complex bilinear form on $\wedge^2 V_{\mathbf{C}}$. For example, $\langle\ ,\ \rangle = \langle\ ,\ \rangle_2$
(we will now revert to the imprecise notation and drop the subscript 2) extends
as

$$\langle \lambda + i\omega,\ \tau + i\rho\rangle := \langle \lambda, \tau\rangle + i\big(\langle \omega, \tau\rangle + \langle \lambda, \rho\rangle\big) - \langle \omega, \rho\rangle.$$

The subspaces

$$\wedge^2 V_{\mathbf{C}}^{+} := \{\lambda - i\star\lambda\}, \qquad \wedge^2 V_{\mathbf{C}}^{-} := \{\lambda + i\star\lambda\}, \qquad \lambda \in \wedge^2 V,$$

are complex linear subspaces which are the $+i$ and $-i$ eigenspaces of $\star$ on $\wedge^2 V_{\mathbf{C}}$.
They are each of three complex dimensions and

$$\wedge^2 V_{\mathbf{C}} = \wedge^2 V_{\mathbf{C}}^{+} \oplus \wedge^2 V_{\mathbf{C}}^{-}.$$

In the physics literature they have the unfortunate names of the space of "self
dual" and "anti-self dual" bivectors.
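
The eigenvalue statement can be made concrete by writing $\star$ on $\wedge^2 V$ in the basis $e_a \wedge e_b$, $a < b$, for signature (1,3), using the rule (11.2). The following is a minimal sketch; the names `star_pair`, `star`, `plus_i_part` and `minus_i_part`, and the 0-based ordering $c\,dt, dx, dy, dz$ of the basis, are assumptions of the example.

```python
from itertools import combinations

SIGNS = (1, -1, -1, -1)                   # signature (1,3)
PAIRS = list(combinations(range(4), 2))   # basis e_a ^ e_b with a < b


def star_pair(pair):
    """(coeff, complement) with star(e_pair) = coeff * e_complement, via (11.2)."""
    comp = tuple(i for i in range(4) if i not in pair)
    seq = pair + comp
    inversions = sum(
        1 for a in range(4) for b in range(a + 1, 4) if seq[a] > seq[b]
    )
    sign_pi = -1 if inversions % 2 else 1
    return sign_pi * SIGNS[comp[0]] * SIGNS[comp[1]], comp


def star(bivector):
    """Apply star to a bivector given by 6 (possibly complex) coefficients."""
    out = [0] * 6
    for idx, pair in enumerate(PAIRS):
        coeff, comp = star_pair(pair)
        out[PAIRS.index(comp)] += coeff * bivector[idx]
    return out


def plus_i_part(bivector):
    """lambda - i*star(lambda), an element of the +i eigenspace."""
    return [a - 1j * b for a, b in zip(bivector, star(bivector))]


def minus_i_part(bivector):
    """lambda + i*star(lambda), an element of the -i eigenspace."""
    return [a + 1j * b for a, b in zip(bivector, star(bivector))]
```

For any real bivector `lam`, applying `star` to `plus_i_part(lam)` multiplies every coefficient by $i$, and applying it to `minus_i_part(lam)` multiplies every coefficient by $-i$.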



5. Show that these two subspaces are orthogonal under (the complex extension
of) $\langle\ ,\ \rangle$.



The (real) vector space $\wedge^2 V$ has dimension 6. Hence the space of symmetric
two tensors over $\wedge^2 V$, the space $S^2(\wedge^2 V)$, has dimension $6 \cdot 7/2 = 21$. The
operator $\star : \wedge^2 V \to \wedge^2 V$ induces an operator (shall we also denote it by $\star$?) of
$S^2(\wedge^2 V) \to S^2(\wedge^2 V)$. The eigenvalues of this induced operator will be all possible
products of two factors of either $i$ or $-i$, so the eigenvalues of the induced
operator $\star : S^2(\wedge^2 V) \to S^2(\wedge^2 V)$ are $\pm 1$. The corresponding eigenspaces are
now real.



6. Show that the dimension of the $-1$ eigenspace is 12 and the dimension
of the $+1$ eigenspace is 9. (Hint: The dimensions of real eigenspaces do not
change if we complexify and then consider dimensions over the complex numbers
with the same real eigenvalues of the complexified operator. Describe the space
$S^2(W_1 \oplus W_2)$, the symmetric two tensors over a direct sum of two vector spaces.)

The reason that the preceding problem will be of importance to us is that
the curvature tensor $R$ at any point of a Lorentzian manifold can be thought of
as lying in $S^2(\wedge^2 V)$ where $V = TM^*_x$, the cotangent space at a point $x$. Actually,
one of the Bianchi identities (the cyclic sum condition) imposes one additional
algebraic constraint on the curvature tensor so that $R$ lies in a 20 dimensional
subspace of the 21 dimensional space $S^2(\wedge^2 V)$. The Einstein condition in free
space will turn out to further restrict $R$ to lie in an eleven dimensional subspace
of the twelve dimensional space of $-1$ eigenvectors, and the more stringent
condition of being Ricci flat will restrict $R$ to lie in a ten dimensional subspace
of this eleven dimensional space. We will spend a good bit of time studying this
ten dimensional space.
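
The dimension count 21 = 9 + 12 quoted above can be cross-checked by a trace computation, which is a different route from the one suggested in the hint to Problem 6. Since $\star^2 = -\mathrm{id}$ on $\wedge^2 V$, the induced operator on $S^2(\wedge^2 V)$ squares to the identity, so its trace equals the dimension of the $+1$ eigenspace minus that of the $-1$ eigenspace. The sketch below assumes signature (1,3); the function names are hypothetical.

```python
from itertools import combinations, combinations_with_replacement

SIGNS = (1, -1, -1, -1)                   # signature (1,3)
PAIRS = list(combinations(range(4), 2))   # basis e_a ^ e_b with a < b


def star_pair(pair):
    """(coeff, complement) with star(e_pair) = coeff * e_complement, via (11.2)."""
    comp = tuple(i for i in range(4) if i not in pair)
    seq = pair + comp
    inversions = sum(
        1 for a in range(4) for b in range(a + 1, 4) if seq[a] > seq[b]
    )
    sign_pi = -1 if inversions % 2 else 1
    return sign_pi * SIGNS[comp[0]] * SIGNS[comp[1]], comp


def induced_trace():
    """Trace of the operator induced by star on S^2 of the bivectors.

    On the basis e_P * e_Q (P <= Q in the list PAIRS) the induced operator
    sends e_P * e_Q to star(e_P) * star(e_Q); only basis elements mapped to a
    multiple of themselves contribute to the trace.
    """
    trace = 0
    for p, q in combinations_with_replacement(range(6), 2):
        cp, comp_p = star_pair(PAIRS[p])
        cq, comp_q = star_pair(PAIRS[q])
        image = sorted((PAIRS.index(comp_p), PAIRS.index(comp_q)))
        if image == [p, q]:
            trace += cp * cq
    return trace


def eigenspace_dimensions():
    """(dim of +1 eigenspace, dim of -1 eigenspace) of the induced operator."""
    total = 6 * 7 // 2
    trace = induced_trace()   # dim(+1) - dim(-1), since the operator squares to id
    return (total + trace) // 2, (total - trace) // 2
```

Here `eigenspace_dimensions()` returns the pair (dimension of the $+1$ eigenspace, dimension of the $-1$ eigenspace).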

11.3 The star operator on forms. 

If $M$ is an oriented semi-Riemannian manifold, we can consider the star operator
associated to each cotangent space. Thus, operating pointwise, we get a star
operator mapping $k$-forms into $(n-k)$-forms, where $n = \dim M$:

$$\star : \Omega^k(M) \to \Omega^{n-k}(M).$$

Many of the important equations of physics have simple expressions in terms
of the star operator on forms. The purpose of the rest of these exercises is to
describe some of them. In fact, all of the equations we shall write down will be
for various star operators of flat space of two, three and four dimensions. But
the general formulation goes over unchanged for curved spaces or spacetimes.

11.3.1 For $\mathbf{R}^2$.

We take our orthonormal frame of forms to be $dx, dy$ and the orientation two
form to be $v := dx \wedge dy$. Then

$$\star dx = dy, \qquad \star dy = -dx$$

as we have already seen.



7. For any pair of smooth real valued functions $u$ and $v$, let

$$\omega := u\,dx - v\,dy.$$

Write out the pair of equations

$$d\star\omega = 0, \qquad d\omega = 0 \tag{11.9}$$

as a system of two partial differential equations for $u$ and $v$. (We will find later
on that Maxwell's equations in the absence of sources have exactly this same
expression, except that for Maxwell's equations $\omega$ is a two form on Minkowski
space instead of being a one form on the plane.) If we allow complex valued
forms, write $f = u + iv$ and $dz = dx + i\,dy$; then the above pair of equations can
be written as

$$d[f\,dz] = 0.$$

It then follows from Stokes' theorem that the integral of $f\,dz$ around the boundary
of any region where $f$ is defined (and smooth) must be zero. This is known
as the Cauchy integral theorem. Notice that

$$f\,dz = \omega + i\star\omega$$

is the anti-self dual form corresponding to $\omega$ in the terminology of the preceding
section.
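
For readers who want to test a candidate answer to Problem 7, the two equations (11.9) can be evaluated numerically. With $\star dx = dy$ and $\star dy = -dx$ one finds $\star\omega = v\,dx + u\,dy$, so $d\star\omega = (u_x - v_y)\,dx \wedge dy$ and $d\omega = -(u_y + v_x)\,dx \wedge dy$. The sketch below approximates the partial derivatives by central differences; the helper names and the step size `h` are assumptions of the example.

```python
def partials(fn, x, y, h=1e-5):
    """Central-difference approximations to (d fn/dx, d fn/dy) at (x, y)."""
    fx = (fn(x + h, y) - fn(x - h, y)) / (2 * h)
    fy = (fn(x, y + h) - fn(x, y - h)) / (2 * h)
    return fx, fy


def residuals(u, v, x, y):
    """Coefficients of dx ^ dy in d(star w) and in dw, for w = u dx - v dy.

    Both entries vanish at (x, y) when the pair of equations (11.9) holds there.
    """
    ux, uy = partials(u, x, y)
    vx, vy = partials(v, x, y)
    return ux - vy, -(uy + vx)


# f(z) = z^2, i.e. u = x^2 - y^2 and v = 2xy, satisfies (11.9):
square = residuals(lambda x, y: x * x - y * y, lambda x, y: 2 * x * y, 0.3, -1.2)

# f(z) = complex conjugate of z, i.e. u = x and v = -y, does not:
conjugate = residuals(lambda x, y: x, lambda x, y: -y, 0.3, -1.2)
```

The first pair of residuals is zero up to rounding, while the second has a non-zero first entry, in line with the fact that $z^2$ is holomorphic and $\bar z$ is not.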



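11.3.2 For $\mathbf{R}^3$.

We have the orthonormal coframe field $dx, dy, dz$, with $v = dx \wedge dy \wedge dz$, so
$\star 1 = v$,

$$\star dx = dy \wedge dz, \qquad \star dy = -dx \wedge dz, \qquad \star dz = dx \wedge dy$$

with

$$\star^2 = 1$$

in all degrees. Let

$$\Delta := \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}.$$

Let

$$\theta = a\,dx + b\,dy + c\,dz, \qquad \Omega = A\,dx \wedge dy + B\,dx \wedge dz + C\,dy \wedge dz.$$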



8. Show that

$$\star d\star d\theta - d\star d\star\theta = -(\Delta a)\,dx - (\Delta b)\,dy - (\Delta c)\,dz \tag{11.10}$$

and

$$-\star d\star d\Omega + d\star d\star\Omega = -(\Delta A)\,dx \wedge dy - (\Delta B)\,dx \wedge dz - (\Delta C)\,dy \wedge dz. \tag{11.11}$$

11.3.3 For $\mathbf{R}^{1,3}$.

We will choose the metric to be of type (1,3) so that we have the "orthonormal"
coframe field $c\,dt, dx, dy, dz$ with

$$\langle c\,dt, c\,dt\rangle = 1$$

and

$$\langle dx, dx\rangle = \langle dy, dy\rangle = \langle dz, dz\rangle = -1.$$

We will choose

$$v = c\,dt \wedge dx \wedge dy \wedge dz.$$

This fixes the star operator. But I am faced with an awkward notational problem
in the next section when we will discuss the Maxwell equations and the
relativistic London equations: We will want to deal with the star operator on
$\mathbf{R}^3$ and $\mathbf{R}^{1,3}$ simultaneously, in fact in the same equation. I could use a subscript,
say $\star_3$ to denote the three dimensional star operator and $\star_4$ to denote
the four dimensional star operator. This would clutter up the equations. So I
have opted to keep the symbol $\star$ for the star operator in three dimensions and,
for the purpose of the rest of this section only, use a different symbol, $\clubsuit$, for the
star operator in four dimensions. So

$$\clubsuit(c\,dt \wedge dx \wedge dy \wedge dz) = 1, \qquad \clubsuit 1 = -c\,dt \wedge dx \wedge dy \wedge dz$$

while

$$\clubsuit\, c\,dt = -dx \wedge dy \wedge dz$$
$$\clubsuit\, dx = -c\,dt \wedge dy \wedge dz$$
$$\clubsuit\, dy = c\,dt \wedge dx \wedge dz$$
$$\clubsuit\, dz = -c\,dt \wedge dx \wedge dy$$

which we can summarize as

$$\clubsuit\, c\,dt = -\star 1, \qquad \clubsuit\theta = -c\,dt \wedge \star\theta \quad \text{for} \quad \theta = a\,dx + b\,dy + c\,dz$$

and

$$\clubsuit(c\,dt \wedge dx) = dy \wedge dz$$
$$\clubsuit(c\,dt \wedge dy) = -dx \wedge dz$$
$$\clubsuit(c\,dt \wedge dz) = dx \wedge dy$$
$$\clubsuit(dx \wedge dy) = -c\,dt \wedge dz$$
$$\clubsuit(dx \wedge dz) = c\,dt \wedge dy$$
$$\clubsuit(dy \wedge dz) = -c\,dt \wedge dx.$$

Notice that the last three equations follow from the preceding three because
$\clubsuit^2 = -\mathrm{id}$ as a map on two forms in $\mathbf{R}^{1,3}$. We can summarize these last six
equations as

$$\clubsuit(c\,dt \wedge \theta) = \star\theta, \qquad \clubsuit\Omega = -c\,dt \wedge \star\Omega.$$

I want to make it clear that in these equations $\theta = a\,dx + b\,dy + c\,dz$ where the
functions $a$, $b$, and $c$ can depend on all four variables, $t$, $x$, $y$ and $z$. Similarly $\Omega$
is a linear combination of $dx \wedge dy$, $dx \wedge dz$ and $dy \wedge dz$ whose coefficients can
depend on all four variables. So we may think of $\theta$ and $\Omega$ as forms on three
space which depend on time.

We have $\clubsuit^2 = \mathrm{id}$ on one forms and on three forms which checks with

$$\clubsuit(c\,dt \wedge dx \wedge dy) = -dz$$

or, more generally,

$$\clubsuit(c\,dt \wedge \Omega) = -\star\Omega.$$
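
The table of values above follows from the rule (11.2) applied to the oriented basis $c\,dt, dx, dy, dz$ with $\langle c\,dt, c\,dt\rangle = 1$ and the other three squares equal to $-1$. A short script can regenerate it. This is a sketch only: the names `NAMES`, `star_basis` and `describe` are hypothetical, and the basis is indexed from 0.

```python
NAMES = ("cdt", "dx", "dy", "dz")
SIGNS = (1, -1, -1, -1)   # <cdt, cdt> = 1, <dx, dx> = <dy, dy> = <dz, dz> = -1


def star_basis(I):
    """Return (coeff, Ic) with star(e_I) = coeff * e_Ic in R^{1,3}, via (11.2)."""
    I = tuple(I)
    Ic = tuple(i for i in range(4) if i not in I)
    seq = I + Ic
    inversions = sum(
        1 for a in range(4) for b in range(a + 1, 4) if seq[a] > seq[b]
    )
    coeff = -1 if inversions % 2 else 1   # (-1)^pi
    for j in Ic:                          # times (-1)^{r(Ic)}
        coeff *= SIGNS[j]
    return coeff, Ic


def describe(I):
    """Human-readable form of star(e_I), e.g. 'star(dx) = -cdt ^ dy ^ dz'."""
    coeff, Ic = star_basis(I)
    left = " ^ ".join(NAMES[i] for i in I) or "1"
    right = " ^ ".join(NAMES[i] for i in Ic) or "1"
    return "star(" + left + ") = " + ("-" if coeff < 0 else "") + right
```

Looping `describe` over all subsets of `range(4)` prints the four dimensional star on every basis form, which can be compared line by line with the list above.
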
11.4 Electromagnetism. 

We begin with two regimes in which we solely use the star operator on $\mathbf{R}^3$.
Then we will pass to the full relativistic theory.

11.4.1 Electrostatics. 

The objects of the theory are:

A linear differential form, $E$, called the electric field strength. A point charge
$e$ experiences the force $eE$. The integral of $E$ along any path gives the voltage
drop along that path. The units of $E$ are

$$\frac{\text{voltage}}{\text{length}} = \frac{\text{energy}}{\text{charge} \cdot \text{length}}.$$

The dielectric displacement, $D$, which is a two form. In principle, we could
measure $D(v_1, v_2)$, where $v_1, v_2 \in T\mathbf{R}^3_x \sim \mathbf{R}^3$ are a pair of vectors, as follows:
construct a parallel-plate capacitor whose plates are metal parallelograms
determined by $hv_1, hv_2$ where $h$ is a small positive number. Place these plates
with the corner at $x$, touch them together, then separate them. They acquire
charges $\pm Q$. The orientation of $\mathbf{R}^3$ picks out one of these two plates which we
call the top plate. Then

$$D(v_1, v_2) = \lim_{h \to 0} \frac{\text{charge on top plate}}{h^2}.$$

The units of $D$ are

$$\frac{\text{charge}}{\text{area}}.$$

The charge density, which is a three form, $\rho$. (We identify densities with
three forms since we have an orientation.)

The key equations in the theory are:

$$dE = 0$$

which, in a simply connected region, implies that $E = -du$ for some function
$u$ called the potential.

The integral of $D$ over the boundary surface of some three dimensional region
is the total charge in the region (Gauss' law):

$$\int_{\partial U} D = \int_U \rho$$

which, by Stokes, can be written differentially as

$$dD = \rho.$$

(I will use units which absorb the traditional $4\pi$ into $\rho$.)

Finally there is a constitutive equation relating $E$ and $D$. In an isotropic
medium it is given by

$$D = \epsilon \star E$$

where $\epsilon$ is called the dielectric factor. In a homogeneous medium it is a constant,
called the dielectric constant. In particular, the dielectric constant of the
vacuum is denoted by $\epsilon_0$. The units of $\epsilon_0$ are

$$\frac{\text{charge}}{\text{area}} \times \frac{\text{charge} \cdot \text{length}}{\text{energy}} = \frac{(\text{charge})^2}{\text{energy} \cdot \text{length}}.$$

The laws of electrostatics, since they involve the star operator, determine the
three dimensional Euclidean geometry of space.

11.4.2 Magnetoquasistatics. 

In this regime, it is assumed that there are no static charges, so $\rho = 0$, and that
Maxwell's term $\partial D/\partial t$ can be ignored; energy is stored in the magnetic field
rather than in capacitors.

The fundamental objects are:

A one form $E$ giving the electric force field. The force on a charge $e$ is $eE$, as
before.

A two form $B$ giving the magnetic induction or the magnetic flux density.
The force on a current element $I$ (which is a vector) is $i(I)B$ where $i$ denotes
interior product.

The current flux, $J$, which is a two form [measured in (amps)/(area)].

A one form, $H$, called the magnetic excitation or the magnetic field. The
integral of $H$ over the boundary, $C$, of a surface $S$ is equal to the flux of current
through the surface. This is Ampere's law:

$$\int_C H = \int_S J. \tag{11.12}$$

Faraday's law of induction says that

$$\frac{d}{dt}\int_S B = -\int_C E. \tag{11.13}$$

By Stokes' theorem, the differential form of Ampere's law is

$$dH = J, \tag{11.14}$$

and of Faraday's law is

$$\frac{\partial B}{\partial t} = -dE. \tag{11.15}$$

Faraday's law implies that the time derivative of $dB$ vanishes. But in fact
we have the stronger assertion (Hertz's law)

$$dB = 0. \tag{11.16}$$

Equations (11.14), (11.15), and (11.16) are the structural laws of electrodynamics
in the magnetoquasistatic approximation. We must supplement them
by constitutive equations. One of these is

$$B = \mu \star H, \tag{11.17}$$

where $\star$ denotes the star operator in three dimensions.
According to Ampere's law, $H$ has units

$$\frac{\text{charge}}{\text{time} \cdot \text{length}}$$

while according to Faraday's law $B$ has units

$$\frac{\text{energy} \cdot \text{time}}{\text{charge} \cdot (\text{length})^2}$$

so that $\mu$ has units

$$\frac{\text{energy} \cdot (\text{time})^2}{(\text{charge})^2 \cdot \text{length}}.$$

Thus $\epsilon \cdot \mu$ has units

$$\frac{(\text{time})^2}{(\text{length})^2} = (\text{velocity})^{-2}$$

and it was Maxwell's great discovery, the foundation stone of all that has
happened since in physics, that

$$\frac{1}{\epsilon_0 \mu_0} = c^2$$

where $c$ is the speed of light. (This discussion is a bit premature in our present
regime of quasimagnetostatics where $D$ plays no role.)
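
The unit bookkeeping leading to $(\epsilon\mu)^{-1/2}$ having the units of a velocity can be mechanized. The sketch below represents a unit as a dictionary of exponents over the base quantities energy, charge, length and time, and follows the text in reading off the units of $\epsilon$ and $\mu$ as the quotients $D/E$ and $B/H$ (so the star operator is treated as carrying no units). All names in it are hypothetical.

```python
from fractions import Fraction


def unit(**exponents):
    """A unit as a dict {base quantity: exponent}, dropping zero exponents."""
    return {name: Fraction(p) for name, p in exponents.items() if p != 0}


def times(a, b):
    """Product of two units."""
    out = dict(a)
    for name, p in b.items():
        out[name] = out.get(name, Fraction(0)) + p
    return {name: p for name, p in out.items() if p != 0}


def power(a, p):
    """A unit raised to a rational power p."""
    return {name: q * Fraction(p) for name, q in a.items()}


def over(a, b):
    """Quotient a / b of two units."""
    return times(a, power(b, -1))


E_FIELD = unit(energy=1, charge=-1, length=-1)           # voltage / length
D_FIELD = unit(charge=1, length=-2)                      # charge / area
H_FIELD = unit(charge=1, time=-1, length=-1)             # from Ampere's law
B_FIELD = unit(energy=1, time=1, charge=-1, length=-2)   # from Faraday's law

EPSILON = over(D_FIELD, E_FIELD)
MU = over(B_FIELD, H_FIELD)
SPEED = power(times(EPSILON, MU), Fraction(-1, 2))
```

Comparing `SPEED` with `unit(length=1, time=-1)` confirms that energy and charge cancel in the product $\epsilon\mu$.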

We need one more constitutive equation, to relate the current to the
electromagnetic field. In ordinary conductivity, one mimics the equation

$$V = RI$$

for a resistor in a network by Ohm's law:

$$J = \sigma \star E. \tag{11.18}$$

According to the Drude theory (as modified by Sommerfeld) the charge carriers
are free electrons and $\sigma$ can be determined semi-empirically from a model
involving the mean free time between collisions as a parameter. Notice that
in ordinary conductivity the charge carrier is something external to the
electromagnetic field, and $\sigma$ is not regarded as a fundamental constant of nature
(like $c$, say) but is an empirical parameter to be derived from another theory,
say statistical mechanics. In fact, Drude proposed the theory of the free electron
gas in 1900, some three years after the discovery of the electron by J.J.
Thomson, and it had a major success in explaining the law of Wiedemann and
Franz, relating thermal conductivity to electrical conductivity. However, if you
look at the lengthy article on conductivity in the 1911 edition of the Encyclopedia
Britannica, written by J.J. Thomson himself, you will find no mention
of electrons in the section on conductivity in solids. The reason is that Drude's
theory gave absolutely the wrong answer for the specific heat of metals, and
this was only rectified in 1927 in the brilliant paper by Sommerfeld where he
replaces Maxwell-Boltzmann statistics by Fermi-Dirac statistics. All this
is explained in a solid state physics course. I repeat my main point: $\sigma$ is not
a fundamental constant and the source of $J$ is external to the electromagnetic
fields.

11.4.3 The London equations. 

In the superconducting domain, it is natural to mimic a network inductor which
satisfies the equation

$$V = L\,\frac{dI}{dt}.$$

So the Londons (1935) introduced the equation

$$\Lambda\,\frac{\partial J}{\partial t} = \star E \tag{11.19}$$

where $\Lambda$ is an empirical parameter similar to the conductance, but the analogue
of inductance of a circuit element. Equation (11.19) is known as the first London
equation. If we assume that $\Lambda$ is a constant, we have

$$\frac{\partial}{\partial t}\star dH = \star\frac{\partial J}{\partial t} = \frac{1}{\Lambda}E.$$

Setting $H = \mu^{-1}\star B$, applying $d$, and using (11.15) we get

$$\frac{\partial}{\partial t}\left(d\star d\star B + \frac{\mu}{\Lambda}B\right) = 0. \tag{11.20}$$



From this one can deduce that an applied external field will not penetrate, but
not the full Meissner effect expelling all magnetic fields in any superconducting
region. Here is a sample argument about the non-penetration of imposed
magnetic fields into a superconducting domain: Since $dB = 0$ we can write

$$d\star d\star B = (d\star d\star - \star d\star d)B = -\Delta B,$$

where $\Delta$ is the usual three dimensional Laplacian applied to the coefficients of $B$
(using (11.11)). Suppose we have a situation which is invariant under translation
in the $x$ and $z$ directions, for example an infinite slab of width $2a$ with sides at
$y = \pm a$ parallel to the plane $y = 0$. Then, assuming the solution is also invariant,
(11.20) becomes

$$\left(\frac{\mu}{\Lambda} - \frac{\partial^2}{\partial y^2}\right)\frac{\partial B}{\partial t} = 0.$$

If we assume symmetry with respect to $y = 0$ in the problem, we get

$$\frac{\partial B}{\partial t} = C(t)\cosh\frac{y}{\lambda},$$

with $C(t)$ independent of $y$, where

$$\lambda := \sqrt{\frac{\Lambda}{\mu}}$$

is called the penetration depth of the superconducting material. It is typically
of order $0.1\,\mu$m. Suppose we impose some time dependent external field which
takes on the value

$$b(t)\,dx \wedge dy,$$

for example, on the surface of the slab. Continuity then gives

$$\frac{\partial B}{\partial t} = b'(t)\,\frac{\cosh(y/\lambda)}{\cosh(a/\lambda)}\,dx \wedge dy.$$

The quotient on the right decays exponentially with the distance from the surface,
measured in units of $\lambda$. So
externally applied magnetic fields do not penetrate, in the sense that the time
derivative of the magnetic flux vanishes exponentially within a few multiples of
the penetration depth. But the full Meissner effect says that all magnetic fields
in the interior are expelled.

So the Londons proposed strengthening (11.20) by requiring that the expression
in parentheses in (11.20) be actually zero, instead of merely assuming that
it is a constant. Since $d\star B = \mu\,dH = \mu J$ (assuming that $\mu$ is a constant) we
get

$$d\star J = -\frac{1}{\Lambda}B. \tag{11.21}$$

Equation (11.21) is known as the second London equation.
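
To get a feeling for the non-penetration argument given before the second London equation, one can tabulate the quotient $\cosh(y/\lambda)/\cosh(a/\lambda)$ for a slab that is many penetration depths thick. The sketch below uses hypothetical function names and an arbitrary choice $a = 10\lambda$. At a depth $d$ below the surface $y = a$ the quotient is very close to $e^{-d/\lambda}$.

```python
import math


def flux_rate_ratio(y, a, lam):
    """cosh(y/lam) / cosh(a/lam): dB/dt at height y relative to the surface y = a."""
    return math.cosh(y / lam) / math.cosh(a / lam)


def ratio_at_depth(depth, a, lam):
    """The same ratio, parametrized by the distance below the surface y = a."""
    return flux_rate_ratio(a - depth, a, lam)


# Illustrative numbers only: half-width a = 10 penetration depths.
LAM = 1.0
A = 10.0
profile = [(d, ratio_at_depth(d, A, LAM)) for d in (0.0, 1.0, 2.0, 5.0, 10.0)]
```

Each entry of `profile` pairs a depth (in units of $\lambda$) with the corresponding suppression factor for the time derivative of $B$.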

11.4.4 The London equations in relativistic form. 

We can write the two London equations in relativistic form, by letting

$$j = -J \wedge dt$$

be the three form representing the current in space time. In general, we write

$$j := \rho - J \wedge dt \tag{11.22}$$

as the three form in space time giving the relativistic "current", but in the
quasistatic regime $\rho = 0$.
We have

$$\clubsuit j = \frac{1}{c}\star J,$$

a one form on space time with no $dt$ component (under our assumption of the
absence of static charge in our space time splitting). So

$$c\,d(\clubsuit j) = d_{\text{space}}\star J - \frac{\partial \star J}{\partial t} \wedge dt,$$

where the $d$ on the left is the full $d$ operator on space time. (From now on, until
the end of this handout, we will be in space-time, and so use $d$ to denote the full
$d$ operator in four dimensions, and use $d_{\text{space}}$ to denote the three dimensional $d$
operator.)

We recall that in the relativistic treatment of Maxwell's equations, the electric
field and the magnetic induction are combined to give the electromagnetic
field

$$F = B + E \wedge dt$$

so that Faraday's law, (11.15), and Hertz's law, (11.16), are combined into the
single equation,

$$dF = 0, \tag{11.23}$$

known as the first Maxwell equation. We see that the two London equations
can also be combined to give

$$d\clubsuit c\Lambda j = -F, \tag{11.24}$$

which implies (11.23). This suggests that superconductivity involves modifying
Maxwell's equations, in contrast to ordinary conductivity which is supplementary
to Maxwell's equations.

11.4.5 Maxwell's equations.

To see the nature of this modification, we recall the second Maxwell equation
which involves the two form

$$G = D - H \wedge dt$$

where $D$ is the "dielectric displacement", as above. Recall that

$$d_{\text{space}} D$$

gives the density of charge according to Gauss' law. The second Maxwell equation
combines Gauss' law and Maxwell's modification of Ampere's law into the
single equation

$$dG = j, \tag{11.25}$$

where the three current, $j$, is given by (11.22). The product $(\epsilon\mu)^{-1/2}$ has the
units of velocity, as we have seen, and let us assume that we are in the
vacuum or in a medium for which this velocity is $c$, the same value as in the
vacuum. So using the corresponding Lorentz metric on space time to define our
$\clubsuit$ operator the combined constitutive relations can be written as

$$G = -\frac{1}{c\mu}\clubsuit F,$$

or using units where $c = 1$ more simply as

$$G = -\frac{1}{\mu}\clubsuit F. \tag{11.26}$$

From now on, we will use "natural" units in which $c = 1$ and in which energy
and mass have units (length)$^{-1}$.

11.4.6 Comparing Maxwell and London. 

The material in this subsection, especially the comments at the end, might be 
acceptable in the mathematics department. You should be warned that they 
do not reflect the currently accepted physical theories of superconductivity, and 
hence might encounter some trouble in the physics department. 

In classical electromagnetic theory, $j$ is regarded as a source term in the
sense that one introduces a one form, $A$, the four potential, with

$$F = -dA$$

and Maxwell's equations become the variational equations for the Lagrangian
with Lagrange density

$$\mathcal{L}_M(A, j) = \frac{1}{2}\,dA \wedge \clubsuit dA - \mu A \wedge j. \tag{11.27}$$

This means the following: $\mathcal{L}_M(A, j)$ is a four form on $\mathbf{R}^{1,3}$ and we can imagine
the "function"

$$L_M(A, j) \text{ ``:=''} \int \mathcal{L}_M(A, j).$$

It is of course not defined because the integral need not converge. But if $C$ is
any smooth one form with compact support, the variation

$$d(L_M)(A, j)[C] := \frac{d}{ds}L_M(A + sC, j)\Big|_{s=0}$$

is well defined, and the condition that this variation vanish for all such $C$ gives
Maxwell's equations.



9. Show that these variational equations do indeed give Maxwell's equations.
Use $d(C \wedge \clubsuit dA) = dC \wedge \clubsuit dA - C \wedge d\clubsuit dA$ and the fact that $\tau \wedge \clubsuit\omega = \omega \wedge \clubsuit\tau$ for
two forms.



In particular, one has gauge invariance: $A$ is only determined up to the
addition of a closed one form, and the Maxwell equations become

$$d\clubsuit dA = \mu j. \tag{11.28}$$

For the London equations, if we apply $\clubsuit$ to (11.24) and use (11.26) we get

$$\clubsuit d\clubsuit \Lambda j = \mu G$$

and so by the second Maxwell equation, (11.25), we have

$$d\clubsuit d\clubsuit \Lambda j = \mu j. \tag{11.29}$$

We no longer restrict $j$ by requiring the absence of stationary charge, but do
observe that "conservation of charge", i.e. $dj = 0$, is a consequence of (11.29).
If we set

$$\clubsuit \Lambda j = A, \tag{11.30}$$

we see that the Maxwell Lagrange density (11.27) is modified to become the
"Proca" Lagrange density

$$\mathcal{L}_L(A) = \frac{1}{2}\left(dA \wedge \clubsuit dA - \frac{\mu}{\Lambda}A \wedge \clubsuit A\right). \tag{11.31}$$



10. Verify this.

A number of remarks are in order:

1. The London equations have no gauge freedom.

2. The Maxwell equations in free space (that is, with $j = 0$) are conformally
invariant. This is a general property of the star operator on middle degrees, in
our case from $\wedge^2$ to $\wedge^2$, as we have seen. But the London equations involve the
star operator from $\wedge^1$ to $\wedge^3$ and hence depend on, and determine, the actual
metric and not just the conformal class. This is to be expected in that the
Meissner effect involves the penetration depth, $\lambda$.

3. Since the units of $\lambda$ are length, the units of $1/\lambda^2$ are (mass)$^2$, as is to be
expected. So the London modification of Maxwell's equations can be expressed
as the addition of a masslike term to the massless photons. In fact, substituting
a plane wave with four momentum $k$ directly into (11.29) shows that $k$ must lie
on the mass shell $k^2 = 1/\lambda^2$.

4. Since the Maxwell equations are the mass zero
limit of the Proca equations, one might say that the London equations represent
the more generic situation from the mathematical point of view. Perhaps the
"true world" is always superconducting and we exist in some limiting case where
the photon can be considered to have mass zero.

5. On the other hand, if one starts from a firm belief in gauge theories, then
one would regard the mass acquisition as the result of spontaneous symmetry
breaking via the Higgs mechanism. In the standard treatment one gets the Higgs
field as the spin zero field given by a Cooper pair. But since the electrons are
not needed for charge transport, as no external source term occurs in (11.29),
one might imagine an entirely different origin for the Higgs field. Do we need
electrons for superconductivity? We don't use them to give mass to quarks or
leptons in the standard model.



